(57) The present invention provides polynucleotide
sequences of the genome of Staphylococcus aureus,
polypeptide sequences encoded by the polynucleotide
sequences, corresponding polynucleotides and
polypeptides, vectors and hosts comprising the
polynucleotides, and assays and other uses thereof. The
present invention further provides polynucleotide and
polypeptide sequence information stored on computer
readable media, and computer-based systems and
methods which facilitate its use.
The present invention relates to the field of molecular biology. In particular, it relates to, among other things,
nucleotide sequences of Staphylococcus aureus, contigs, ORFs, fragments, probes, primers and related polynucleotides
thereof, peptides and polypeptides encoded by the sequences, and uses of the polynucleotides and sequences thereof, 
such as in fermentation, polypeptide production, assays and pharmaceutical development, among others. 

The genus Staphylococcus includes at least 20 distinct species. (For a review see Novick, R. P., The Staphylococcus
as a Molecular Genetic System, Chapter 1, pgs. 1-37 in MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI,
R. Novick, Ed., VCH Publishers, New York (1990)). Species differ from one another by 80% or more, by hybridization
kinetics, whereas strains within a species are at least 90% identical by the same measure. 

The species Staphylococcus aureus, a gram-positive, facultatively aerobic, clump-forming coccus, is among the
most important etiological agents of bacterial infection in humans, as discussed briefly below. 

Human Health and S. Aureus 

Staphylococcus aureus is a ubiquitous pathogen. (See, for instance, Mims et al., MEDICAL MICROBIOLOGY,
Mosby-Year Book Europe Limited, London, UK (1993)). It is an etiological agent of a variety of conditions, ranging in 
severity from mild to fatal. A few of the more common conditions caused by S. aureus infection are burns, cellulitis, 
eyelid infections, food poisoning, joint infections, neonatal conjunctivitis, osteomyelitis, skin infections, surgical wound
infection, scalded skin syndrome and toxic shock syndrome, some of which are described further below. 



Burns 

Burn wounds generally are sterile initially. However, they generally compromise physical and immune barriers to
infection, cause loss of fluid and electrolytes and result in local or general physiological dysfunction. After cooling, 
contact with viable bacteria results in mixed colonization at the injury site. Infection may be restricted to the non-viable 
debris on the burn surface ("eschar"), it may progress into full skin infection and invade viable tissue below the eschar 
and it may reach below the skin, enter the lymphatic and blood circulation and develop into septicaemia. S. aureus is 
among the most important pathogens typically found in burn wound infections. It can destroy granulation tissue and 
produce severe septicaemia. 



Cellulitis 



Cellulitis, an acute infection of the skin that expands from a typically superficial origin to spread below the cutaneous 
layer, most commonly is caused by S. aureus in conjunction with S. pyogenes. Cellulitis can lead to systemic infection.
In fact, cellulitis can be one aspect of synergistic bacterial gangrene. This condition typically is caused by a mixture of 
S. aureus and microaerophilic streptococci. It causes necrosis and treatment is limited to excision of the necrotic tissue. 
The condition often is fatal. 

Eyelid infections



S. aureus is the cause of styes and of "sticky eye" in neonates, among other eye infections. Typically such infections
are limited to the surface of the eye, and may occasionally penetrate the surface with more severe consequences. 

Food poisoning



Some strains of S. aureus produce one or more of five serologically distinct, heat- and acid-stable enterotoxins
(enterotoxins A-E) that are not destroyed by the digestive processes of the stomach and small intestine. Ingestion of the toxin,
in sufficient quantities, typically results in severe vomiting, but not diarrhoea. The effect does not require viable bacteria. 
Although the toxins are known, their mechanism of action is not understood. 






Joint infections 

S. aureus infects bone joints, causing diseases such as osteomyelitis.

Osteomyelitis

S. aureus is the most common causative agent of haematogenous osteomyelitis. The disease tends to occur in
children and adolescents more than adults and it is associated with non-penetrating injuries to bones. Infection typically
occurs in the long end of growing bone, hence its occurrence in physically immature populations. Most often, infection 
is localized in the vicinity of sprouting capillary loops adjacent to epiphysial growth plates in the end of long, growing 
bones. 


Skin infections 

S. aureus is the most common pathogen of such minor skin infections as abscesses and boils. Such infections 
often are resolved by normal host response mechanisms, but they also can develop into severe internal infections. 
Recurrent infections of the nasal passages plague nasal carriers of S. aureus.

Surgical Wound Infections 

Surgical wounds often penetrate far into the body. Infection of such wounds thus poses a grave risk to the patient.
S. aureus is the most important causative agent of infections in surgical wounds. S. aureus is unusually adept at
invading surgical wounds; sutured wounds can be infected by far fewer S. aureus cells than are necessary to cause
infection in normal skin. Invasion of a surgical wound can lead to severe S. aureus septicaemia. Invasion of the blood
stream by S. aureus can lead to seeding and infection of internal organs, particularly heart valves and bone, causing 
systemic diseases, such as endocarditis and osteomyelitis. 


Scalded Skin Syndrome 

S. aureus is responsible for "scalded skin syndrome" (also called toxic epidermal necrosis, Ritter's disease and
Lyell's disease). This disease occurs in older children, typically in outbreaks caused by flowering of S. aureus strains
that produce exfoliatin (also called scalded skin syndrome toxin). Although the bacteria initially may infect only a minor
lesion, the toxin destroys intercellular connections, spreads epidermal layers and allows the infection to penetrate the
outer layer of the skin, producing the desquamation that typifies the disease. Shedding of the outer layer of skin
generally reveals normal skin below, but fluid lost in the process can produce severe injury in young children if it is not 
treated properly. 


Toxic Shock Syndrome 

Toxic shock syndrome is caused by strains of S. aureus that produce the so-called toxic shock syndrome toxin. 
The disease can be caused by S. aureus infection at any site, but it is too often erroneously viewed exclusively as a 
disease solely of women who use tampons. The disease involves toxaemia and septicaemia, and can be fatal.

Nosocomial Infections

In the 1984 National Nosocomial Infection Surveillance Study ("NNIS") S. aureus was the most prevalent agent
of surgical wound infections in many hospital services, including medicine, surgery, obstetrics, pediatrics and newborns.

Resistance to drugs of S. aureus strains 

Prior to the introduction of penicillin the prognosis for patients seriously infected with S. aureus was unfavorable. 
Following the introduction of penicillin in the early 1940s even the worst S. aureus infections generally could be treated
successfully. The emergence of penicillin-resistant strains of S. aureus did not take long, however. Most strains of S.
aureus encountered in hospital infections today do not respond to penicillin; although, fortunately, this is not the case 
for S. aureus encountered in community infections. 

It is well known now that penicillin-resistant strains of S. aureus produce a lactamase which converts penicillin to 
penicilloic acid, and thereby destroys antibiotic activity. Furthermore, the lactamase gene often is propagated episomally,
typically on a plasmid, and often is only one of several genes on an episomal element that, together, confer
multidrug resistance. 

Methicillins, introduced in the 1960s, largely overcame the problem of penicillin resistance in S. aureus. These
compounds conserve the portions of penicillin responsible for antibiotic activity and modify or alter other portions that
make penicillin a good substrate for inactivating lactamases. However, methicillin resistance has emerged in S. aureus,
along with resistance to many other antibiotics effective against this organism, including aminoglycosides, tetracycline, 
chloramphenicol, macrolides and lincosamides. In fact, methicillin-resistant strains of S. aureus generally are multiply 
drug resistant. 

The molecular genetics of most types of drug resistance in S. aureus has been elucidated (See Lyon et al., Microbiology
Reviews 51: 88-134 (1987)). Generally, resistance is mediated by plasmids, as noted above regarding penicillin
resistance; however, several stable forms of drug resistance have been observed that apparently involve integration 
of a resistance element into the S. aureus genome itself. 

Thus far each new antibiotic gives rise to resistant strains, strains emerge that are resistant to multiple drugs
and increasingly persistent forms of resistance begin to emerge. Drug resistance of S. aureus infections already poses
significant treatment difficulties, which are likely to get much worse unless new therapeutic agents are developed. 

Molecular Genetics of Staphylococcus Aureus 


Despite its importance in, among other things, human disease, relatively little is known about the genome of this 
organism. 

Most genetic studies of S. aureus have been carried out using the strain NCTC8325, which contains prophages
psi11, psi12 and psi13, and the UV-cured derivative of this strain, 8325-4 (also referred to as RN450), which is free of
the prophages.

These studies revealed that the S. aureus genome, like that of other staphylococci, consists of one circular, covalently
closed, double-stranded DNA and a collection of so-called variable accessory genetic elements, such as
prophages, plasmids, transposons and the like. 

Physical characterization of the genome has not been carried out in any detail. Pattee et al. published a low resolution
and incomplete genetic and physical map of the chromosome of S. aureus strain NCTC 8325. (Pattee et al.,
Genetic and Physical Mapping of Chromosome of Staphylococcus aureus NCTC 8325, Chapter 11, pgs. 163-169 in
MOLECULAR BIOLOGY OF THE STAPHYLOCOCCI, R. P. Novick, Ed., VCH Publishers, New York, (1990)). The genetic
map largely was produced by mapping insertions of Tn551 and Tn4001, which, respectively, confer erythromycin and
gentamicin resistance, and by analysis of SmaI-digested DNA by Pulsed Field Gel Electrophoresis ("PFGE").

The map was of low resolution; even estimating the physical size of the genome was difficult, according to the
investigators. The largest SmaI chromosome fragment, for instance, was too large for accurate sizing by
PFGE. To estimate its size, additional restriction sites had to be introduced into the chromosome using a transposon
containing a SmaI recognition sequence.

In sum, most physical characteristics and almost all of the genes of Staphylococcus aureus are unknown. Among
the few genes that have been identified, most have not been physically mapped or characterized in detail. Only a very
few genes of this organism have been sequenced. (See, for instance, Thornsberry, J. Antimicrobial Chemotherapy 21
Suppl C: 9-16 (1988), current versions of GENBANK and other nucleic acid databases, and references that relate to
the genome of S. aureus such as those set out elsewhere herein.) 

It is clear that the etiology of diseases mediated or exacerbated by S. aureus infection involves the programmed
expression of S. aureus genes, and that characterizing the genes and their patterns of expression would add dramatically
to our understanding of the organism and its host interactions. Knowledge of S. aureus genes and genomic
organization would dramatically improve understanding of disease etiology and lead to improved and new ways of
preventing, ameliorating, arresting and reversing diseases. Moreover, characterized genes and genomic fragments of 
S. aureus would provide reagents for, among other things, detecting, characterizing and controlling S. aureus infections. 

There is a need therefore to characterize the genome of S. aureus and for polynucleotides and sequences of this
organism. 

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome. The primary 
nucleotide sequences which were generated are provided in SEQ ID NOS: 1-5,191. 

The present invention provides the nucleotide sequence of several thousand contigs of the Staphylococcus aureus 
genome, which are listed in tables below and set out in the Sequence Listing submitted herewith, and representative
fragments thereof, in a form which can be readily used, analyzed, and interpreted by a skilled artisan. In one embodiment,
the present invention is provided as contiguous strings of primary sequence information corresponding to the
nucleotide sequences depicted in SEQ ID NOS:1-5,191.

The present invention further provides nucleotide sequences which are at least 95%, preferably 99% and most 
preferably 99.9%, identical to the nucleotide sequences of SEQ ID NOS:1-5,191.

The nucleotide sequence of SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence 
which is at least 95%, preferably 99% and most preferably 99.9%, identical to the nucleotide sequence of SEQ ID 
NOS:1-5,191 may be provided in a variety of mediums to facilitate its use. In one application of this embodiment, the
sequences of the present invention are recorded on computer readable media. Such media include, but are not limited
to: magnetic storage media, such as floppy discs, hard disc storage medium, and magnetic tape; optical storage media
such as CD-ROM; electrical storage media such as RAM and ROM; and hybrids of these categories such as magnetic/
optical storage media. 

The present invention further provides systems, particularly computer-based systems which contain the sequence 
information herein described stored in a data storage means. Such systems are designed to identify commercially
important fragments of the Staphylococcus aureus genome.

Another embodiment of the present invention is directed to fragments, preferably isolated fragments, of the Staphylococcus
aureus genome having particular structural or functional attributes. Such fragments of the Staphylococcus
aureus genome of the present invention include, but are not limited to, fragments which encode peptides, hereinafter
referred to as open reading frames or "ORFs," fragments which modulate the expression of an operably linked ORF,
hereinafter referred to as expression modulating fragments or "EMFs," and fragments which can be used to diagnose
the presence of Staphylococcus aureus in a sample, hereinafter referred to as diagnostic fragments or "DFs."

Each of the ORFs in fragments of the Staphylococcus aureus genome disclosed in Tables 1-3, and the EMFs 
found 5' to the ORFs, can be used in numerous ways as polynucleotide reagents. For instance, the sequences can be
used as diagnostic probes or amplification primers for detecting or determining the presence of a specific microbe in
a sample, to selectively control gene expression in a host and in the production of polypeptides, such as polypeptides
encoded by ORFs of the present invention, particularly those polypeptides that have a pharmacological activity.

The present invention further includes recombinant constructs comprising one or more fragments of the Staphylococcus
aureus genome of the present invention. The recombinant constructs of the present invention comprise vectors,
such as a plasmid or viral vector, into which a fragment of the Staphylococcus aureus genome has been inserted.

The present invention further provides host cells containing any of the isolated fragments of the Staphylococcus 
aureus genome of the present invention. The host cells can be a higher eukaryotic host cell, such as a mammalian 
cell, a lower eukaryotic cell, such as a yeast cell, or a procaryotic cell such as a bacterial cell. 

The present invention is further directed to polypeptides and proteins, preferably isolated polypeptides and proteins,
encoded by ORFs of the present invention. A variety of methods, well known to those of skill in the art, routinely
may be utilized to obtain any of the polypeptides and proteins of the present invention. For instance, polypeptides and
proteins of the present invention having relatively short, simple amino acid sequences readily can be synthesized using
commercially available automated peptide synthesizers. Polypeptides and proteins of the present invention also may
be purified from bacterial cells which naturally produce the protein. Yet another alternative is to purify polypeptides and
proteins of the present invention from cells which have been altered to express them.

The invention further provides polypeptides, preferably isolated polypeptides, comprising Staphylococcus aureus
epitopes and vaccine compositions comprising such polypeptides. Also provided are methods for vaccinating an
individual against Staphylococcus aureus infection.

The invention further provides methods of obtaining homologs of the fragments of the Staphylococcus aureus
genome of the present invention and homologs of the proteins encoded by the ORFs of the present invention. Specifically,
by using the nucleotide and amino acid sequences disclosed herein as a probe or as primers, and techniques
such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs.

The invention further provides antibodies which selectively bind polypeptides and proteins of the present invention. 
Such antibodies include both monoclonal and polyclonal antibodies.

The invention further provides hybridomas which produce the above-described antibodies. A hybridoma is an
immortalized cell line which is capable of secreting a specific monoclonal antibody. 

The present invention further provides methods of identifying test samples derived from cells which express one 
of the ORFs of the present invention, or a homolog thereof. Such methods comprise incubating a test sample with one 
or more of the antibodies of the present invention, or one or more of the DFs or antigens of the present invention, under
conditions which allow a skilled artisan to determine if the sample contains the ORF or product produced therefrom. 

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry 
out the above-described assays. 

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers 
which comprises: (a) a first container comprising one of the antibodies, antigens, or one of the DFs of the present
invention; and (b) one or more other containers comprising one or more of the following: wash reagents, reagents 
capable of detecting presence of bound antibodies, antigens or hybridized DFs. 

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining 
and identifying agents capable of binding to a polypeptide or protein encoded by one of the ORFs of the present 
invention. Specifically, such agents include, as further described below, antibodies, peptides, carbohydrates, pharmaceutical
agents and the like. Such methods comprise steps of: (a) contacting an agent with an isolated protein encoded
by one of the ORFs of the present invention; and (b) determining whether the agent binds to said protein.

The present genomic sequences of Staphylococcus aureus will be of great value to all laboratories working with 
this organism and for a variety of commercial purposes. Many fragments of the Staphylococcus aureus genome will
be immediately identified by similarity searches against GenBank or protein databases and will be of immediate value
to Staphylococcus aureus researchers and of immediate commercial value for the production of proteins or to control
gene expression.

The methodology and technology for elucidating extensive genomic sequences of bacterial and other genomes
have enhanced, and will greatly enhance, the ability to analyze and understand chromosomal organization. In particular, sequenced
contigs and genomes will provide the models for developing tools for the analysis of chromosome structure and function,
including the ability to identify genes within large segments of genomic DNA, the structure, position, and spacing of
regulatory elements, the identification of genes with potential industrial applications, and the ability to do comparative
genomics and molecular phylogeny.

FIGURE 1 is a block diagram of a computer system (102) that can be used to implement computer-based systems 
of the present invention.

FIGURE 2 is a schematic diagram depicting the data flow and computer programs used to collect, assemble, edit 
and annotate the contigs of the Staphylococcus aureus genome of the present invention. Both Macintosh and Unix 
platforms are used to handle the AB 373 and 377 sequence data files, largely as described in Kerlavage et al., Proceedings
of the Twenty-Sixth Annual Hawaii International Conference on System Sciences, 585, IEEE Computer Society
Press, Washington D.C. (1993). Factura (AB) is a Macintosh program designed for automatic vector sequence
removal and end-trimming of sequence files. The program Loadis runs on a Macintosh platform and parses the feature
data extracted from the sequence files by Factura to the Unix based Staphylococcus aureus relational database. Assembly
of contigs (and whole genome sequences) is accomplished by retrieving a specific set of sequence files and
their associated features using extrseq, a Unix utility for retrieving sequences from an SQL database. The resulting
sequence file is processed by seq_filter to trim portions of the sequences with more than 2% ambiguous nucleotides.
The sequence files were assembled using TIGR Assembler, an assembly engine designed at The Institute for Genomic
Research ("TIGR") for rapid and accurate assembly of thousands of sequence fragments. The collection of contigs
generated by the assembly step is loaded into the database with the lassie program. Identification of open reading
frames (ORFs) is accomplished by processing contigs with zorf. The ORFs are searched against S. aureus sequences
from Genbank and against all protein sequences using the BLASTN and BLASTP programs, described in Altschul et
al., J. Mol. Biol. 215: 403-410 (1990). Results of the ORF determination and similarity searching steps were loaded
into the database. As described below, some results of the determination and the searches are set out in Tables 1-3. 
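
By way of illustration only, the ORF identification step can be sketched in Python. This sketch does not reproduce zorf, whose algorithm is not described here. It assumes, purely for the example, that an ORF begins at an ATG codon and ends at the first in-frame TAA, TAG or TGA codon, and it scans only the three forward reading frames. The function name find_orfs and the min_codons parameter are hypothetical.

```python
# Illustrative sketch only; not the zorf program referred to in the text.
# Assumptions: ATG start codon, standard stop codons, forward strand only.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=3):
    """Return sorted (start, end) pairs for ATG...stop ORFs.

    Coordinates are 0-based and end-exclusive, and the span includes
    the stop codon. min_codons counts the start and stop codons too.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None  # index of the currently open ATG, if any
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the ORF and look for the next ATG
    return sorted(orfs)
```

A fuller treatment would also scan the reverse complement and might admit alternative start codons; both are omitted here for brevity.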

The present invention is based on the sequencing of fragments of the Staphylococcus aureus genome and analysis 
of the sequences. The primary nucleotide sequences generated by sequencing the fragments are provided in SEQ ID 
NOS:1-5,191. (As used herein, the "primary sequence" refers to the nucleotide sequence represented by the IUPAC 
nomenclature system.) 
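
As a minimal sketch of how a primary sequence expressed in IUPAC codes might be checked, the following Python functions test whether a string uses only IUPAC DNA symbols and compute the fraction of ambiguous symbols. The 2% limit is the figure given above for seq_filter, but the functions themselves are hypothetical and are not the seq_filter program; they evaluate a whole sequence rather than trimming portions of it.

```python
# Illustrative sketch only; function names are hypothetical.
UNAMBIGUOUS = set("ACGT")
# IUPAC symbols that each stand for two or more possible bases.
AMBIGUOUS = set("RYSWKMBDHVN")


def is_valid_iupac(seq):
    """True if seq is non-empty and uses only IUPAC DNA symbols (U is excluded)."""
    s = seq.upper()
    return len(s) > 0 and all(ch in UNAMBIGUOUS or ch in AMBIGUOUS for ch in s)


def ambiguous_fraction(seq):
    """Fraction of positions in seq holding an ambiguous IUPAC symbol."""
    s = seq.upper()
    if not s:
        raise ValueError("empty sequence")
    return sum(ch in AMBIGUOUS for ch in s) / len(s)


def passes_ambiguity_filter(seq, max_fraction=0.02):
    """True if no more than max_fraction of the symbols are ambiguous."""
    return ambiguous_fraction(seq) <= max_fraction
```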

In addition to the aforementioned Staphylococcus aureus polynucleotides and polynucleotide sequences, the
present invention provides the nucleotide sequences of SEQ ID NOS:1-5,191, or representative fragments thereof, in
a form which can be readily used, analyzed, and interpreted by a skilled artisan. 

As used herein, a "representative fragment of the nucleotide sequence depicted in SEQ ID NOS:1-5,191" refers
to any portion of the SEQ ID NOS:1-5,191 which is not presently represented within a publicly available database.
Preferred representative fragments of the present invention are Staphylococcus aureus open reading frames ("ORFs"),
expression modulating fragments ("EMFs") and fragments which can be used to diagnose the presence of Staphylococcus
aureus in a sample ("DFs"). A non-limiting identification of preferred representative fragments is provided in Tables
1-3. 

As discussed in detail below, the information provided in SEQ ID NOS:1-5,191 and in Tables 1-3 together with
routine cloning, synthesis, sequencing and assay methods will enable those skilled in the art to clone and sequence
all "representative fragments" of interest, including open reading frames encoding a large variety of Staphylococcus
aureus proteins. 

While the presently disclosed sequences of SEQ ID NOS:1-5,191 are highly accurate, sequencing techniques are 
not perfect and, in relatively rare instances, further investigation of a fragment or sequence of the invention may reveal 
a nucleotide sequence error present in a nucleotide sequence disclosed in SEQ ID NOS:1-5,191. However, once the
present invention is made available (i.e., once the information in SEQ ID NOS:1-5,191 and Tables 1-3 has been made
available), resolving a rare sequencing error in SEQ ID NOS:1-5,191 will be well within the skill of the art. The present
disclosure makes available sufficient sequence information to allow any of the described contigs or portions thereof to
be obtained readily by straightforward application of routine techniques. Further sequencing of such polynucleotides
may proceed in like manner using manual and automated sequencing methods which are employed ubiquitously in the
art. Nucleotide sequence editing software is publicly available. For example, Applied Biosystems' (AB) AutoAssembler
can be used as an aid during visual inspection of nucleotide sequences. By employing such routine techniques potential 
errors readily may be identified and the correct sequence then may be ascertained by targeting further sequencing 
effort, also of a routine nature, to the region containing the potential error. 

Even if all of the very rare sequencing errors in SEQ ID NOS:1-5,191 were corrected, the resulting nucleotide
sequences would still be at least 95% identical, nearly all would be at least 99% identical, and the great majority would
be at least 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191. 

As discussed elsewhere herein, polynucleotides of the present invention readily may be obtained by routine application
of well known and standard procedures for cloning and sequencing DNA. Detailed methods for obtaining
libraries and for sequencing are provided below, for instance. A wide variety of Staphylococcus aureus strains that can
be used to prepare S. aureus genomic DNA for cloning and for obtaining polynucleotides of the present invention are
available to the public from recognized depository institutions, such as the American Type Culture Collection ("ATCC").

The nucleotide sequences of the genomes from different strains of Staphylococcus aureus differ somewhat. However,
the nucleotide sequences of the genomes of all Staphylococcus aureus strains will be at least 95% identical, in
corresponding part, to the nucleotide sequences provided in SEQ ID NOS:1-5,191. Nearly all will be at least 99% 
identical and the great majority will be 99.9% identical. 

Thus, the present invention further provides nucleotide sequences which are at least 95%, preferably 99% and 
most preferably 99.9% identical to the nucleotide sequences of SEQ ID NOS:1-5,191, in a form which can be readily 
used, analyzed and interpreted by the skilled artisan.

Methods for determining whether a nucleotide sequence is at least 95%, at least 99% or at least 99.9% identical 
to the nucleotide sequences of SEQ ID NOS:1-5,191 are routine and readily available to the skilled artisan. For example,
the well known FASTA algorithm described in Pearson and Lipman, Proc. Natl. Acad. Sci. USA 85: 2444 (1988) can be
used to generate the percent identity of nucleotide sequences. The BLASTN program also can be used to generate
an identity score of polynucleotides compared to one another.
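
The percent identity thresholds can be illustrated with a short Python sketch. It is not an implementation of FASTA or BLASTN; it assumes the two sequences have already been aligned by some such program, with "-" marking gaps, and it adopts one common convention in which every alignment column counts toward the denominator except columns that are gaps in both sequences. The function name percent_identity is hypothetical.

```python
# Illustrative sketch only; assumes the inputs are already aligned.
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same base.

    Columns that are gaps in both sequences are skipped; a gap opposite
    a base counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared
```

Under this convention a 20-base alignment with one mismatch scores 95%, the lowest of the three thresholds recited above. Other conventions, such as dividing by the length of the shorter sequence, give different values for gapped alignments.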

COMPUTER RELATED EMBODIMENTS 

The nucleotide sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical to a polynucleotide sequence
of SEQ ID NOS:1-5,191 may be "provided" in a variety of mediums to facilitate use thereof. As used herein,
"provided" refers to a manufacture, other than an isolated nucleic acid molecule, which contains a nucleotide sequence
of the present invention; i.e., a nucleotide sequence provided in SEQ ID NOS:1-5,191, a representative fragment
thereof, or a nucleotide sequence at least 95%, preferably at least 99% and most preferably at least 99.9% identical
to a polynucleotide of SEQ ID NOS:1-5,191. Such a manufacture provides a large portion of the Staphylococcus aureus
genome and parts thereof (e.g., a Staphylococcus aureus open reading frame (ORF)) in a form which allows a skilled
artisan to examine the manufacture using means not directly applicable to examining the Staphylococcus aureus genome
or a subset thereof as it exists in nature or in purified form.

In one application of this embodiment, a nucleotide sequence of the present invention can be recorded on computer 
readable media. As used herein, "computer readable media" refers to any medium which can be read and accessed
directly by a computer. Such media include, but are not limited to: magnetic storage media, such as floppy discs, hard
disc storage medium, and magnetic tape; optical storage media such as CD-ROM; electrical storage media such as
RAM and ROM; and hybrids of these categories, such as magnetic/optical storage media. A skilled artisan can readily
appreciate how any of the presently known computer readable mediums can be used to create a manufacture comprising
computer readable medium having recorded thereon a nucleotide sequence of the present invention. Likewise,
it will be clear to those of skill how additional computer readable media that may be developed also can be used to
create analogous manufactures having recorded thereon a nucleotide sequence of the present invention.

As used herein, "recorded" refers to a process for storing information on computer readable medium. A skilled 
artisan can readily adopt any of the presently known methods for recording information on computer readable medium
to generate manufactures comprising the nucleotide sequence information of the present invention.

A variety of data storage structures are available to a skilled artisan for creating a computer readable medium 
having recorded thereon a nucleotide sequence of the present invention. The choice of the data storage structure will 
generally be based on the means chosen to access the stored information. In addition, a variety of data processor 
programs and formats can be used to store the nucleotide sequence information of the present invention on computer 
readable medium. The sequence information can be represented in a word processing text file, formatted in commercially
available software such as WordPerfect and Microsoft Word, or represented in the form of an ASCII file, stored
in a database application, such as DB2, Sybase, Oracle, or the like. A skilled artisan can readily adapt any number of 
data-processor structuring formats (e.g., text file or database) in order to obtain computer readable medium having 
recorded thereon the nucleotide sequence information of the present invention. 
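
By way of illustration only, the following minimal Python sketch records sequence information both as a plain ASCII text file and in a relational table. The FASTA-style file layout, the record names, the short sequences, and the use of SQLite (standing in for a database application such as those named above) are assumptions made for the sketch and are not part of the disclosure.

```python
import os
import sqlite3
import tempfile

# Hypothetical records (identifier, nucleotide sequence); not real SEQ ID data.
records = [("contig_1", "ATGGCTAAAGGTTAA"), ("contig_2", "TTGACCGGTATGCCC")]

def write_fasta(path, recs, width=60):
    """Write records to a plain ASCII file, one '>' header line per sequence."""
    with open(path, "w", encoding="ascii") as fh:
        for name, seq in recs:
            fh.write(f">{name}\n")
            for i in range(0, len(seq), width):
                fh.write(seq[i:i + width] + "\n")

def read_fasta(path):
    """Read the ASCII file back into a list of (name, sequence) tuples."""
    out, name, parts = [], None, []
    with open(path, encoding="ascii") as fh:
        for line in fh:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    out.append((name, "".join(parts)))
                name, parts = line[1:], []
            elif line:
                parts.append(line)
    if name is not None:
        out.append((name, "".join(parts)))
    return out

def store_in_db(conn, recs):
    """Store the same records in a relational table (SQLite used as a stand-in)."""
    conn.execute("CREATE TABLE IF NOT EXISTS sequences (name TEXT PRIMARY KEY, seq TEXT)")
    conn.executemany("INSERT INTO sequences VALUES (?, ?)", recs)
    conn.commit()

path = os.path.join(tempfile.mkdtemp(), "contigs.fasta")
write_fasta(path, records)
conn = sqlite3.connect(":memory:")
store_in_db(conn, read_fasta(path))
stored = conn.execute("SELECT name, seq FROM sequences ORDER BY name").fetchall()
```

Either representation can then serve as the data storage structure; which one is preferable depends, as noted above, on the means chosen to access the stored information.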

Computer software is publicly available which allows a skilled artisan to access sequence information provided in 
a computer readable medium. Thus, by providing in computer readable form the nucleotide sequences of SEQ ID 
NOS:1-5,191, a representative fragment thereof, or a nucleotide sequence at least 95%, preferably at least 99% and
most preferably at least 99.9% identical to a sequence of SEQ ID NOS:1-5,191, the present invention enables the
skilled artisan routinely to access the provided sequence information for a wide variety of purposes.

The examples which follow demonstrate how software which implements the BLAST (Altschul et al., J. Mol. Biol.
215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem. 17:203-207 (1993)) search algorithms on a Sybase
system was used to identify open reading frames (ORFs) within the Staphylococcus aureus genome which contain
homology to ORFs or proteins from both Staphylococcus aureus and from other organisms. Among the ORFs discussed
herein are protein encoding fragments of the Staphylococcus aureus genome useful in producing commercially important
proteins, such as enzymes used in fermentation reactions and in the production of commercially useful metabolites.

The present invention further provides systems, particularly computer-based systems, which contain the sequence
information described herein. Such systems are designed to identify, among other things, commercially important
fragments of the Staphylococcus aureus genome.

As used herein, "a computer-based system" refers to the hardware means, software means, and data storage 
means used to analyze the nucleotide sequence information of the present invention. The minimum hardware means 
of the computer-based systems of the present invention comprises a central processing unit (CPU), input means, 
output means, and data storage means. A skilled artisan can readily appreciate that any one of the currently available
computer-based systems is suitable for use in the present invention.

As stated above, the computer-based systems of the present invention comprise a data storage means having 
stored therein a nucleotide sequence of the present invention and the necessary hardware means and software means 
for supporting and implementing a search means. 

As used herein, "data storage means" refers to memory which can store nucleotide sequence information of the 
present invention, or a memory access means which can access manufactures having recorded thereon the nucleotide 
sequence information of the present invention. 

As used herein, "search means" refers to one or more programs which are implemented on the computer-based
system to compare a target sequence or target structural motif with the sequence information stored within the data 
storage means. Search means are used to identify fragments or regions of the present genomic sequences which 
match a particular target sequence or target motif. A variety of known algorithms are disclosed publicly, and a variety
of commercially available software packages for conducting search means are available and can be used in the
computer-based systems of the present invention. Examples of such software include, but are not limited to,
MacPattern (EMBL), BLASTN and BLASTX (NCBI). A skilled artisan can readily recognize that any one of the available algorithms or implementing
software packages for conducting homology searches can be adapted for use in the present computer-based systems. 
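
As a deliberately simplified illustration of a search means, the sketch below reports every exact occurrence of a target sequence, on either strand, within a set of stored sequences. It is not an implementation of MacPattern, BLASTN or BLASTX; the function names and the two short example sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written in A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]

def find_target(target, database):
    """Return (name, strand, start) for every exact occurrence of target.

    start is the 0-based position on the stored (forward) strand; for '-' hits
    it marks where the reverse complement of the target begins.
    A target equal to its own reverse complement is reported on both strands.
    """
    hits = []
    for name, seq in database.items():
        for strand, query in (("+", target), ("-", reverse_complement(target))):
            start = seq.find(query)
            while start != -1:
                hits.append((name, strand, start))
                start = seq.find(query, start + 1)  # allow overlapping hits
    return hits

# Hypothetical stored sequences, for illustration only.
database = {"contig_1": "ATGGCTAAAGGTTAA", "contig_2": "CCTTTAGCCATGCCC"}
hits = find_target("GCTAAAGG", database)
```

Homology search programs additionally tolerate mismatches and gaps and score the resulting alignments, which exact matching does not attempt.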

As used herein, a "target sequence" can be any DNA or amino acid sequence of six or more nucleotides or two 
or more amino acids. A skilled artisan can readily recognize that the longer a target sequence is, the less likely a target 
sequence will be present as a random occurrence in the database. The most preferred sequence length of a target 
sequence is from about 10 to 100 amino acids or from about 30 to 300 nucleotide residues. However, it is well recognized
that searches for commercially important fragments, such as sequence fragments involved in gene expression and
protein processing, may be of shorter length.
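
The relationship between target length and chance occurrence can be made concrete with a rough estimate. The sketch below assumes, purely for illustration, that database residues are independent and equally frequent, and that both strands of a nucleotide database are searched; the database size of one million residues is hypothetical.

```python
def expected_random_hits(target_len, db_len, alphabet_size=4, strands=2):
    """Expected number of chance exact matches under a uniform random model.

    Each of the (db_len - target_len + 1) start positions matches with
    probability alphabet_size ** -target_len; the count is multiplied by the
    number of strands searched.
    """
    positions = max(db_len - target_len + 1, 0)
    return strands * positions * alphabet_size ** (-target_len)

# Hypothetical database of one million nucleotides.
short_target = expected_random_hits(6, 10**6)    # hundreds of chance hits
long_target = expected_random_hits(30, 10**6)    # far below one chance hit
```

Under these assumptions a six-nucleotide target is expected to occur by chance hundreds of times, whereas a thirty-nucleotide target is essentially never expected to occur by chance.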

As used herein, "a target structural motif," or "target motif," refers to any rationally selected sequence or combi- 
nation of sequences in which the sequence(s) are chosen based on a three-dimensional configuration which is formed 
upon the folding of the target motif. There are a variety of target motifs known in the art. Protein target motifs include, 
but are not limited to, enzymic active sites and signal sequences. Nucleic acid target motifs include, but are not limited 
to, promoter sequences, hairpin structures and inducible expression elements (protein binding sequences).

A variety of structural formats for the input and output means can be used to input and output the information in 
the computer-based systems of the present invention. A preferred format for an output means ranks fragments of the 
Staphylococcus aureus genomic sequences possessing varying degrees of homology to the target sequence or target 
motif. Such presentation provides a skilled artisan with a ranking of sequences which contain various amounts of the 
target sequence or target motif and identifies the degree of homology contained in the identified fragment. 
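
A ranked output of this kind might be sketched as follows. Ungapped percent identity over the best same-length window is used here as a stand-in for a degree of homology; it is an assumption of the sketch, not the scoring used by BLAST or BLAZE, and the fragment names and sequences are hypothetical.

```python
def best_window_identity(target, seq):
    """Return (percent identity, 0-based start) of the best ungapped window."""
    n = len(target)
    best = (0.0, -1)
    for start in range(len(seq) - n + 1):
        matches = sum(a == b for a, b in zip(target, seq[start:start + n]))
        identity = 100.0 * matches / n
        if identity > best[0]:  # keep the first window reaching the maximum
            best = (identity, start)
    return best

def rank_fragments(target, database):
    """Rank stored sequences by their best-window identity to the target."""
    rows = []
    for name, seq in database.items():
        identity, start = best_window_identity(target, seq)
        rows.append((name, start, identity))
    return sorted(rows, key=lambda row: row[2], reverse=True)

# Hypothetical fragments, for illustration only.
database = {
    "frag_c": "GGGGGGGGGGGG",
    "frag_b": "TTACGAACGTTT",
    "frag_a": "TTACGTACGTTT",
}
ranked = rank_fragments("ACGTACGT", database)
```

Each row pairs a fragment with the position and degree of its best match, so the most similar fragments appear first.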

A variety of comparing means can be used to compare a target sequence or target motif with the data storage 
means to identify sequence fragments of the Staphylococcus aureus genome. In the present examples, implementing 
software which implements the BLAST and BLAZE algorithms, described in Altschul et al., J. Mol. Biol. 215:403-410
(1990), was used to identify open reading frames within the Staphylococcus aureus genome. A skilled artisan can 
readily recognize that any one of the publicly available homology search programs can be used as the search means 
for the computer-based systems of the present invention. Of course, suitable proprietary systems that may be known 
to those of skill also may be employed in this regard. 

Figure 1 provides a block diagram of a computer system illustrative of embodiments of this aspect of the present
invention. The computer system 102 includes a processor 106 connected to a bus 104. Also connected to the bus 104
are a main memory 108 (preferably implemented as random access memory, RAM) and a variety of secondary storage
devices 110, such as a hard drive 112 and a removable medium storage device 114. The removable medium storage
device 114 may represent, for example, a floppy disk drive, a CD-ROM drive, a magnetic tape drive, etc. A removable
storage medium 116 (such as a floppy disk, a compact disk, a magnetic tape, etc.) containing control logic and/or data
recorded therein may be inserted into the removable medium storage device 114. The computer system 102 includes
appropriate software for reading the control logic and/or the data from the removable medium storage device 114, once
it is inserted into the removable medium storage device 114.

A nucleotide sequence of the present invention may be stored in a well known manner in the main memory 108,
any of the secondary storage devices 110, and/or a removable storage medium 116. During execution, software for
accessing and processing the genomic sequence (such as search tools, comparing tools, etc.) resides in main memory
108, in accordance with the requirements and operating parameters of the operating system, the hardware system 
and the software program or programs. 

BIOCHEMICAL EMBODIMENTS 

Other embodiments of the present invention are directed to fragments of the Staphylococcus aureus genome,
preferably to isolated fragments. The fragments of the Staphylococcus aureus genome of the present invention include,
but are not limited to, fragments which encode peptides, hereinafter open reading frames (ORFs), fragments which
modulate the expression of an operably linked ORF, hereinafter expression modulating fragments (EMFs), and fragments
which can be used to diagnose the presence of Staphylococcus aureus in a sample, hereinafter diagnostic
fragments (DFs).

As used herein, an "isolated nucleic acid molecule" or an "isolated fragment of the Staphylococcus aureus genome"
refers to a nucleic acid molecule possessing a specific nucleotide sequence which has been subjected to purification
means to reduce, from the composition, the number of compounds which are normally associated with the composition.
Particularly, the term refers to the nucleic acid molecules having the sequences set out in SEQ ID NOS:1-5,191, to 
representative fragments thereof as described above, to polynucleotides at least 95%, preferably at least 99% and 
especially preferably at least 99.9% identical in sequence thereto, also as set out above. 

A variety of purification means can be used to generate the isolated fragments of the present invention. These
include, but are not limited to, methods which separate constituents of a solution based on charge, solubility, or size.

In one embodiment, Staphylococcus aureus DNA can be mechanically sheared to produce fragments of 15-20 kb
in length. These fragments can then be used to generate a Staphylococcus aureus library by inserting them into
lambda clones as described in the Examples below. Primers flanking, for example, an ORF, such as those enumerated
in Tables 1-3, can then be generated using nucleotide sequence information provided in SEQ ID NOS:1-5,191. Well
known and routine techniques of PCR cloning then can be used to isolate the ORF from the lambda DNA library of
Staphylococcus aureus genomic DNA. Thus, given the availability of SEQ ID NOS:1-5,191, the information in Tables
1, 2 and 3, and the information that may be obtained readily by analysis of the sequences of SEQ ID NOS:1-5,191
using methods set out above, those of skill will be enabled by the present disclosure to isolate any ORF-containing or
other nucleic acid fragment of the present invention.

The isolated nucleic acid molecules of the present invention include, but are not limited to, single stranded and
double stranded DNA, and single stranded RNA.

As used herein, an "open reading frame," ORF, means a series of triplets coding for amino acids without any
termination codons and is a sequence translatable into protein. 

Tables 1, 2 and 3 list ORFs in the Staphylococcus aureus genomic contigs of the present invention that were 
identified as putative coding regions by the GeneMark software using organism-specific second-order Markov probability
transition matrices. It will be appreciated that other criteria can be used, in accordance with well known analytical
methods, such as those discussed herein, to generate more inclusive, more restrictive or more selective lists. 

Table 1 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are at least 80 amino 
acids long and that, over a continuous region of at least 50 bases, are 95% or more identical (by BLAST analysis) to
an S. aureus nucleotide sequence available through Genbank in November 1996.

Table 2 sets out ORFs in the Staphylococcus aureus contigs of the present invention that are not in Table 1 and 
match, with a BLASTP probability score of 0.01 or less, a polypeptide sequence available through Genbank by Sep- 
tember 1996. 

Table 3 sets out ORFs in the Staphylococcus aureus contigs of the present invention that do not match significantly, 
by BLASTP analysis, a polypeptide sequence available through Genbank by September 1996.

In each table, the first and second columns identify the ORF by, respectively, contig number and ORF number 
within the contig; the third column indicates the reading frame, taking the first 5' nucleotide of the contig as the start of 
the +1 frame; the fourth column indicates the first nucleotide of the ORF, counting from the 5' end of the contig strand; 
and the fifth column indicates the length of each ORF in nucleotides.

In Tables 1 and 2, column six lists the "Reference" for the closest matching sequence available through Genbank.
These reference numbers are the database entry numbers commonly used by those of skill in the art, who will be
familiar with their denominators. Descriptions of the nomenclature are available from the National Center for Biotechnology
Information. Column seven in Tables 1 and 2 provides the "gene name" of the matching sequence; column eight
provides the BLAST "identity" score from the comparison of the ORF and the homologous gene; and column nine
indicates the length in nucleotides of the "highest scoring segment pair" identified by the BLAST identity analysis.

In Table 3, the last column, column six, indicates the length of each ORF in amino acid residues.

The concepts of percent identity and percent similarity of two polypeptide sequences are well understood in the art.
For example, two polypeptides 10 amino acids in length which differ at three amino acid positions (e.g., at positions
1, 3 and 5) are said to have a percent identity of 70%. However, the same two polypeptides would be deemed to have
a percent similarity of 80% if, for example at position 5, the amino acid moieties, although not identical, were "similar"
(i.e., possessed similar biochemical characteristics). Many programs for analysis of nucleotide or amino acid sequence
similarity, such as FASTA and BLAST, specifically list percent identity of a matching region as an output parameter. Thus,
for instance, Tables 1 and 2 herein enumerate the "percent identity" of the "highest scoring segment pair" in each ORF
and its listed relative. Further details concerning the algorithms and criteria used for homology searches are provided
below and are described in the pertinent literature highlighted by the citations provided below.
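
The worked example above can be reproduced with a few lines of Python. The two ten-residue peptides and the single "similar" pair (aspartate and glutamate, both acidic) are hypothetical choices made for the sketch; programs such as FASTA and BLAST use full substitution matrices rather than a hand-written set of pairs.

```python
def percent_identity(a, b):
    """Percent of aligned positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)

def percent_similarity(a, b, similar_pairs):
    """Percent of positions that are identical or form a listed similar pair."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    hits = sum(x == y or frozenset((x, y)) in similar_pairs for x, y in zip(a, b))
    return 100.0 * hits / len(a)

# Hypothetical peptides differing at positions 1, 3 and 5 (1-based).
peptide_a = "MKLVDEGHST"
peptide_b = "AKGVEEGHST"
similar_pairs = {frozenset("DE")}  # assumption: D and E are treated as similar

identity = percent_identity(peptide_a, peptide_b)
similarity = percent_similarity(peptide_a, peptide_b, similar_pairs)
```

Here seven of ten positions are identical, giving 70% identity, and the conservative difference at position 5 raises the similarity to 80%.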

It will be appreciated that other criteria can be used to generate more inclusive and more exclusive listings of the 
types set out in the tables. As those of skill will appreciate, narrow and broad searches both are useful. Thus, a skilled
artisan can readily identify ORFs in contigs of the Staphylococcus aureus genome other than those listed in Tables
1-3, such as ORFs which are overlapping or encoded by the opposite strand of an identified ORF in addition to those
ascertainable using the computer-based systems of the present invention. 

As used herein, an "expression modulating fragment," EMF, means a series of nucleotide molecules which mod- 
ulates the expression of an operably linked ORF or EMF. 

As used herein, a sequence is said to "modulate the expression of an operably linked sequence" when the expression
of the sequence is altered by the presence of the EMF. EMFs include, but are not limited to, promoters and
promoter modulating sequences (inducible elements). One class of EMFs are fragments which induce the expression
of an operably linked ORF in response to a specific regulatory factor or physiological event.

EMF sequences can be identified within the contigs of the Staphylococcus aureus genome by their proximity to
the ORFs provided in Tables 1-3. An intergenic segment, or a fragment of the intergenic segment, from about 10 to
200 nucleotides in length, taken from any one of the ORFs of Tables 1-3 will modulate the expression of an operably
linked ORF in a fashion similar to that found with the naturally linked ORF sequence. As used herein, an "intergenic
segment" refers to fragments of the Staphylococcus aureus genome which are between two ORF(s) herein described.
EMFs also can be identified using known EMFs as a target sequence or target motif in the computer-based systems
of the present invention. Further, the two methods can be combined and used together.
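
Locating candidate intergenic segments from ORF coordinates can be sketched as follows. The sketch takes each ORF as a (first nucleotide, length) pair on one contig, ignores strand, and keeps only gaps whose whole length falls within the 10 to 200 nucleotide range; these simplifications and the example coordinates are assumptions made for illustration.

```python
def intergenic_segments(orfs, min_len=10, max_len=200):
    """Return 1-based inclusive (start, end) gaps between adjacent ORFs.

    orfs is a list of (first_nt, length_nt) pairs on the same contig.
    Only gaps with min_len <= length <= max_len are returned.
    """
    spans = sorted((first, first + length - 1) for first, length in orfs)
    segments = []
    prev_end = spans[0][1] if spans else 0
    for start, end in spans[1:]:
        gap_start, gap_end = prev_end + 1, start - 1
        gap_len = gap_end - gap_start + 1  # zero or negative when ORFs touch or overlap
        if min_len <= gap_len <= max_len:
            segments.append((gap_start, gap_end))
        prev_end = max(prev_end, end)
    return segments

# Hypothetical ORF coordinates on a single contig.
orfs = [(1, 300), (351, 600), (955, 90), (2000, 300)]
candidates = intergenic_segments(orfs)
```

In this invented example only the 50-nucleotide gap at positions 301 to 350 qualifies; the 4-nucleotide and 955-nucleotide gaps fall outside the range.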

The presence and activity of an EMF can be confirmed using an EMF trap vector. An EMF trap vector contains a 
cloning site linked to a marker sequence. A marker sequence encodes an identifiable phenotype, such as antibiotic 
resistance or a complementing nutrition auxotrophic factor, which can be identified or assayed when the EMF trap
vector is placed within an appropriate host under appropriate conditions. As described above, an EMF will modulate the
expression of an operably linked marker sequence. A more detailed discussion of various marker sequences is provided
below. 

A sequence which is suspected of being an EMF is cloned in all three reading frames in one or more restriction
sites upstream from the marker sequence in the EMF trap vector. The vector is then transformed into an appropriate
host using known procedures and the phenotype of the transformed host is examined under appropriate conditions.
As described above, an EMF will modulate the expression of an operably linked marker sequence.

As used herein, a "diagnostic fragment," DF, means a series of nucleotide molecules which selectively hybridize 
to Staphylococcus aureus sequences. DFs can be readily identified by identifying unique sequences within contigs of 
the Staphylococcus aureus genome, such as by using well-known computer analysis software, and by generating and 
testing probes or amplification primers consisting of the DF sequence in an appropriate diagnostic format which determines
amplification or hybridization selectivity.
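
One simple computational starting point for finding unique sequences is to list the subsequences of a fixed length that occur in a target sequence but in none of a set of comparison sequences, on either strand. The sketch below does this for short invented sequences; the subsequence length and all names are hypothetical, and any candidate found this way would still need to be tested experimentally for selectivity as described above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written in A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]

def kmers(seq, k):
    """Return the set of all length-k subsequences of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def candidate_diagnostic_kmers(target, background, k):
    """Return sorted k-mers of target absent from both strands of background."""
    seen = set()
    for seq in background:
        seen |= kmers(seq, k) | kmers(reverse_complement(seq), k)
    return sorted(kmers(target, k) - seen)

# Invented sequences, for illustration only.
candidates = candidate_diagnostic_kmers("ACGTTGCA", ["ACGTT"], k=4)
```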

The sequences falling within the scope of the present invention are not limited to the specific sequences herein 
described, but also include allelic and species variations thereof. Allelic and species variations can be routinely determined
by comparing the sequences provided in SEQ ID NOS:1-5,191, a representative fragment thereof, or a nucleotide
sequence at least 95%, preferably 99% and most preferably 99.9% identical to SEQ ID NOS:1-5,191, with a sequence
from another isolate of the same species.

Furthermore, to accommodate codon variability, the invention includes nucleic acid molecules coding for the same
amino acid sequences as do the nucleic acid sequences mentioned above. In other words, in the coding region of an 
ORF, substitution of one codon for another which encodes the same amino acid is expressly contemplated. 
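
Whether two coding sequences differ only by such synonymous codon substitutions can be checked by translating both and comparing the results. The sketch below builds the standard genetic code table and applies it to short invented sequences; the function names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(seq):
    """Translate a DNA coding sequence, ignoring any trailing partial codon."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))

def encodes_same_polypeptide(seq_a, seq_b):
    """True when both sequences translate to the same amino acid sequence."""
    return translate(seq_a) == translate(seq_b)

# Invented example: GCT -> GCC (both Ala) and AAA -> AAG (both Lys).
original = "ATGGCTAAA"
variant = "ATGGCCAAG"
same = encodes_same_polypeptide(original, variant)
```

The two sequences differ at two nucleotide positions yet both translate to Met-Ala-Lys, which is the situation contemplated above.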

Any specific sequence disclosed herein can be readily screened for errors by resequencing a particular fragment, 
such as an ORF, in both directions (i.e., sequencing both strands). Alternatively, error screening can be performed by
sequencing corresponding polynucleotides of Staphylococcus aureus origin isolated by using part or all of the fragments 
in question as a probe or primer. 

Each of the ORFs of the Staphylococcus aureus genome disclosed in Tables 1, 2 and 3, and the EMFs found 5' 
to the ORFs, can be used as polynucleotide reagents in numerous ways. For example, the sequences can be used
as diagnostic probes or diagnostic amplification primers to detect the presence of a specific microbe in a sample,
particularly Staphylococcus aureus. Especially preferred in this regard are ORFs such as those of Table 3, which do not
match previously characterized sequences from other organisms and thus are most likely to be highly selective for
Staphylococcus aureus. Also particularly preferred are ORFs that can be used to distinguish between strains of Staphylococcus
aureus, particularly those that distinguish medically important strains, such as drug-resistant strains.

In addition, the fragments of the present invention, as broadly described, can be used to control gene expression
through triple helix formation or antisense DNA or RNA, both of which methods are based on the binding of a polynucleotide
sequence to DNA or RNA. Triple helix formation optimally results in a shut-off of RNA transcription from DNA,
while antisense RNA hybridization blocks translation of an mRNA molecule into polypeptide. Information from the
sequences of the present invention can be used to design antisense and triple helix-forming oligonucleotides. Polynucleotides
suitable for use in these methods are usually 20 to 40 bases in length and are designed to be complementary
to a region of the gene involved in transcription, for triple-helix formation, or to the mRNA itself, for antisense inhibition.
Both techniques have been demonstrated to be effective in model systems, and the requisite techniques are well known
and involve routine procedures. Triple helix techniques are discussed in, for example, Lee et al., Nucl. Acids Res. 6:
3073 (1979); Cooney et al., Science 241: 456 (1988); and Dervan et al., Science 251: 1360 (1991). Antisense techniques
in general are discussed in, for instance, Okano, J. Neurochem. 56: 560 (1991) and OLIGODEOXYNUCLEOTIDES
AS ANTISENSE INHIBITORS OF GENE EXPRESSION, CRC Press, Boca Raton, FL (1988).
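
The complementarity requirement for an antisense oligonucleotide can be illustrated with a short sketch that derives a DNA oligonucleotide complementary to a chosen region of an mRNA. The mRNA sequence, the function name and the choice of region are hypothetical; practical design involves further considerations (for example target accessibility and specificity) that the sketch does not address.

```python
RNA_TO_ANTISENSE_DNA = str.maketrans("ACGU", "TGCA")

def antisense_oligo(mrna, start, length=20):
    """Return a DNA oligonucleotide (5' to 3') complementary to an mRNA region.

    start is the 0-based position of the region on the mRNA; length is limited
    to the 20 to 40 base range mentioned in the text.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    region = mrna[start:start + length]
    if len(region) < length:
        raise ValueError("region extends past the end of the mRNA")
    # Complement each base, then reverse so the result reads 5' to 3'.
    return region.translate(RNA_TO_ANTISENSE_DNA)[::-1]

# Invented 24-nucleotide mRNA, for illustration only.
mrna = "AUGGCUAAAGGUUCUGAAGCUUAA"
oligo = antisense_oligo(mrna, start=0, length=20)
```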

The present invention further provides recombinant constructs comprising one or more fragments of the Staphylococcus
aureus genomic fragments and contigs of the present invention. Certain preferred recombinant constructs of
the present invention comprise a vector, such as a plasmid or viral vector, into which a fragment of the Staphylococcus 
aureus genome has been inserted, in a forward or reverse orientation. In the case of a vector comprising one of the 
ORFs of the present invention, the vector may further comprise regulatory sequences, including for example, a pro- 
moter, operably linked to the ORF. For vectors comprising the EMFs of the present invention, the vector may further 
comprise a marker sequence or heterologous ORF operably linked to the EMF.

Large numbers of suitable vectors and promoters are known to those of skill in the art and are commercially 
available for generating the recombinant constructs of the present invention. The following vectors are provided by 
way of example. Useful bacterial vectors include phagescript, PsiX174, pBluescript SK and KS (+ and -), pNH8a, 
pNH16a, pNH18a, pNH46a (available from Stratagene); pTrc99A, pKK223-3, pKK233-3, pDR540, pRIT5 (available 
from Pharmacia). Useful eukaryotic vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (available from Stratagene);
pSVK3, pBPV, pMSG, pSVL (available from Pharmacia). 

Promoter regions can be selected from any desired gene using CAT (chloramphenicol transferase) vectors or other 
vectors with selectable markers. Two appropriate vectors are pKK232-8 and pCM7. Particular named bacterial promoters
include lacI, lacZ, T3, T7, gpt, lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV
thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-1. Selection of the appropriate
vector and promoter is well within the level of ordinary skill in the art. 

The present invention further provides host cells containing any one of the isolated fragments of the Staphylococcus 
aureus genomic fragments and contigs of the present invention, wherein the fragment has been introduced into the 
host cell using known methods. The host cell can be a higher eukaryotic host cell, such as a mammalian cell, a lower 
eukaryotic host cell, such as a yeast cell, or a procaryotic cell, such as a bacterial cell.

A polynucleotide of the present invention, such as a recombinant construct comprising an ORF of the present 
invention, may be introduced into the host by a variety of well established techniques that are standard in the art, such 
as calcium phosphate transfection, DEAE-dextran mediated transfection and electroporation, which are described in,
for instance, Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY (1986).

A host cell containing one of the fragments of the Staphylococcus aureus genomic fragments and contigs of the
present invention can be used in conventional manners to produce the gene product encoded by the isolated fragment
(in the case of an ORF) or can be used to produce a heterologous protein under the control of the EMF.

The present invention further provides isolated polypeptides encoded by the nucleic acid fragments of the present 
invention or by degenerate variants of the nucleic acid fragments of the present invention. By "degenerate variant" is 
intended nucleotide fragments which differ from a nucleic acid fragment of the present invention (e.g., an ORF) by
nucleotide sequence but, due to the degeneracy of the Genetic Code, encode an identical polypeptide sequence. 

Preferred nucleic acid fragments of the present invention are the ORFs depicted in Tables 2 and 3 which encode 
proteins. 

A variety of methodologies known in the art can be utilized to obtain any one of the isolated polypeptides or proteins 
of the present invention. At the simplest level, the amino acid sequence can be synthesized using commercially available
peptide synthesizers. This is particularly useful in producing small peptides and fragments of larger polypeptides.
Such short fragments as may be obtained most readily by synthesis are useful, for example, in generating antibodies 
against the native polypeptide, as discussed further below. 

In an alternative method, the polypeptide or protein is purified from bacterial cells which naturally produce the 
polypeptide or protein. One skilled in the art can readily employ well-known methods for isolating polypeptides and
proteins to isolate and purify polypeptides or proteins of the present invention produced naturally by a bacterial strain,
or by other methods. Methods for isolation and purification that can be employed in this regard include, but are not
limited to, immunochromatography, HPLC, size-exclusion chromatography, ion-exchange chromatography, and immuno-affinity
chromatography.

The polypeptides and proteins of the present invention also can be purified from cells which have been altered to 
express the desired polypeptide or protein. As used herein, a cell is said to be altered to express a desired polypeptide 
or protein when the cell, through genetic manipulation, is made to produce a polypeptide or protein which it normally 
does not produce or which the cell normally produces at a lower level. Those skilled in the art can readily adapt procedures
for introducing and expressing either recombinant or synthetic sequences into eukaryotic or prokaryotic cells
in order to generate a cell which produces one of the polypeptides or proteins of the present invention. 

Any host/vector system can be used to express one or more of the ORFs of the present invention. These include,
but are not limited to, eukaryotic hosts such as HeLa cells, CV-1 cells, COS cells, and Sf9 cells, as well as prokaryotic
hosts such as E. coli and B. subtilis. The most preferred cells are those which do not normally express the particular
polypeptide or protein or which express the polypeptide or protein at a low natural level.

"Recombinant," as used herein, means that a polypeptide or protein is derived from recombinant (e.g., microbial
or mammalian) expression systems. "Microbial" refers to recombinant polypeptides or proteins made in bacterial or
fungal (e.g., yeast) expression systems. As a product, "recombinant microbial" defines a polypeptide or protein essentially
free of native endogenous substances and unaccompanied by associated native glycosylation. Polypeptides or
proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or 
proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells. 

"Nucleotide sequence" refers to a heteropolymer of deoxyribonucleotides. Generally, DNA segments encoding the 
polypeptides and proteins provided by this invention are assembled from fragments of the Staphylococcus aureus 
genome and short oligonucleotide linkers, or from a series of oligonucleotides, to provide a synthetic gene which is
capable of being expressed in a recombinant transcriptional unit comprising regulatory elements derived from a mi- 
crobial or viral operon. 

"Recombinant expression vehicle or vector" refers to a plasmid or phage or virus or vector, for expressing a polypep- 
tide from a DNA (RNA) sequence. The expression vehicle can comprise a transcriptional unit comprising an assembly 

of (1) genetic regulatory elements necessary for gene expression in the host, including elements required to initiate
and maintain transcription at a level sufficient for suitable expression of the desired polypeptide, including, for example,
promoters and, where necessary, an enhancer and a polyadenylation signal; (2) a structural or coding sequence
which is transcribed into mRNA and translated into protein; and (3) appropriate signals to initiate translation at the
beginning of the desired coding region and terminate translation at its end. Structural units intended for use in yeast
or eukaryotic expression systems preferably include a leader sequence enabling extracellular secretion of translated
protein by a host cell. Alternatively, where recombinant protein is expressed without a leader or transport sequence, 
it may include an N-terminal methionine residue. This residue may or may not be subsequently cleaved from the 
expressed recombinant protein to provide a final product. 

"Recombinant expression system" means host cells which have stably integrated a recombinant transcriptional 

unit into chromosomal DNA or carry the recombinant transcriptional unit extrachromosomally. The cells can be prokaryotic
or eukaryotic. Recombinant expression systems as defined herein will express heterologous polypeptides or proteins
upon induction of the regulatory elements linked to the DNA segment or synthetic gene to be expressed.

Mature proteins can be expressed in mammalian cells, yeast, bacteria, or other cells under the control of appropriate
promoters. Cell-free translation systems can also be employed to produce such proteins using RNAs derived
from the DNA constructs of the present invention. Appropriate cloning and expression vectors for use with prokaryotic
and eukaryotic hosts are described in Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, 2nd Edition,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, New York (1989), the disclosure of which is hereby
incorporated by reference in its entirety.

Generally, recombinant expression vectors will include origins of replication and selectable markers permitting
transformation of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a
promoter derived from a highly expressed gene to direct transcription of a downstream structural sequence. Such
promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), alpha-factor,
acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled
in appropriate phase with translation initiation and termination sequences, and preferably, a leader sequence capable
of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the heterologous
sequence can encode a fusion protein including an N-terminal identification peptide imparting desired characteristics,
e.g., stabilization or simplified purification of expressed recombinant product.

Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a
desired protein together with suitable translation initiation and termination signals in operable reading phase with a
functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication
to ensure maintenance of the vector and, when desirable, provide amplification within the host.

Suitable prokaryotic hosts for transformation include strains of Staphylococcus aureus, E. coli, B. subtilis,
Salmonella typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus. Others
may also be employed as a matter of choice.

As a representative but non-limiting example, useful expression vectors for bacterial use can comprise a selectable
marker and bacterial origin of replication derived from commercially available plasmids comprising genetic elements
of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, pKK223-3
(available from Pharmacia Fine Chemicals, Uppsala, Sweden) and GEM 1 (available from Promega Biotec, Madison,
WI, USA). These pBR322 "backbone" sections are combined with an appropriate promoter and the structural sequence
to be expressed.

Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the
selected promoter, where it is inducible, is derepressed or induced by appropriate means (e.g., temperature shift or
chemical induction) and cells are cultured for an additional period to provide for expression of the induced gene product.
Thereafter cells are typically harvested, generally by centrifugation, disrupted to release expressed protein, generally
by physical or chemical means, and the resulting crude extract is retained for further purification.

Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of
mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described in Gluzman, Cell 23:175
(1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and
BHK cell lines.

Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also
any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination
sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for
example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required
nontranscribed genetic elements.

Recombinant polypeptides and proteins produced in bacterial culture are usually isolated by initial extraction from
cell pellets, followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps.
Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw
cycling, sonication, mechanical disruption, or use of cell lysing agents. Protein refolding steps can be used, as
necessary, in completing configuration of the mature protein. Finally, high performance liquid chromatography (HPLC) can
be employed for final purification steps.

An additional aspect of the invention includes Staphylococcus aureus polypeptides which are useful as
immunodiagnostic antigens and/or immunoprotective vaccines, collectively "immunologically useful polypeptides". Such
immunologically useful polypeptides may be selected from the ORFs disclosed herein based on techniques well known
in the art and described elsewhere herein. The inventors have used the following criteria to select several
immunologically useful polypeptides:

As is known in the art, an amino terminal type I signal sequence directs a nascent protein across the plasma and
outer membranes to the exterior of the bacterial cell. Such outermembrane polypeptides are expected to be
immunologically useful. According to Izard, J. W. et al., Mol. Microbiol. 13, 765-773 (1994), polypeptides containing type I
signal sequences contain the following physical attributes: The length of the type I signal sequence is approximately
15 to 25 primarily hydrophobic amino acid residues with a net positive charge in the extreme amino terminus; the
central region of the signal sequence must adopt an alpha-helical conformation in a hydrophobic environment; and the
region surrounding the actual site of cleavage is ideally six residues long, with small side-chain amino acids in the -1
and -3 positions.
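The sequence-level parts of the type I criteria above can be sketched as a simple screen. The following Python sketch is illustrative only and is not the inventors' algorithm: the residue classes (`HYDROPHOBIC`, `SMALL`), the five-residue window taken as the "extreme amino terminus", the majority threshold for "primarily hydrophobic", and the function names are all assumptions introduced here. The alpha-helical requirement is not evaluated.

```python
# Residue classes are illustrative assumptions, not definitions from the text.
HYDROPHOBIC = set("AVLIMFWC")
SMALL = set("AGS")
BASIC = set("KR")
ACIDIC = set("DE")


def net_charge(segment):
    """Count basic residues minus acidic residues in a segment."""
    return sum(r in BASIC for r in segment) - sum(r in ACIDIC for r in segment)


def passes_type_i_screen(protein, signal_length, n_region=5):
    """Check a proposed signal peptide against the sequence-level criteria.

    signal_length is the number of residues before the proposed cleavage
    site, so protein[signal_length - 1] is position -1.
    """
    if not 15 <= signal_length <= 25 or signal_length >= len(protein):
        return False
    signal = protein[:signal_length]
    hydrophobic_fraction = sum(r in HYDROPHOBIC for r in signal) / signal_length
    return (
        net_charge(signal[:n_region]) > 0   # net positive extreme N-terminus
        and hydrophobic_fraction > 0.5      # "primarily hydrophobic" (assumed cutoff)
        and signal[-1] in SMALL             # small side chain at position -1
        and signal[-3] in SMALL             # small side chain at position -3
    )
```

For instance, `passes_type_i_screen("MKKTAIAIAVALAGFATVAQAAPKD", 21)` evaluates an illustrative input with a 21-residue proposed signal peptide; the input is not one of the disclosed ORFs.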

Also known in the art is the type IV signal sequence, which is an example of the several types of functional signal
sequences which exist in addition to the type I signal sequence detailed above. Although functionally related, the type
IV signal sequence possesses a unique set of biochemical and physical attributes (Strom, M. S. and Lory, S.,
J. Bacteriol. 174, 7345-7351 (1992)). Type IV signal sequences typically consist of six to eight amino acids with a net
basic charge followed by an additional sixteen to thirty primarily hydrophobic residues. The cleavage site of a type IV
signal sequence is typically after the initial six to eight amino acids at the extreme amino terminus. In addition, all
type IV signal sequences contain a phenylalanine residue at the +1 site relative to the cleavage site.
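As a minimal sketch of how the type IV attributes above might be tested on a candidate sequence, the following Python function tries leader lengths of six to eight residues and checks for a net basic leader, a phenylalanine at +1, and a primarily hydrophobic stretch after the cleavage site. The hydrophobic residue set, the 0.7 fraction threshold, and the function name are assumptions made for illustration; only the minimum stretch of sixteen residues is examined.

```python
HYDROPHOBIC = set("AVLIMFWC")  # assumed residue class, not from the text
BASIC = set("KR")
ACIDIC = set("DE")


def type_iv_leader_length(protein, min_core=16, min_fraction=0.7):
    """Return the leader length (6, 7 or 8) that satisfies the screen, else None."""
    for leader_len in range(6, 9):
        if len(protein) < leader_len + min_core:
            continue
        leader = protein[:leader_len]
        charge = sum(r in BASIC for r in leader) - sum(r in ACIDIC for r in leader)
        # Residues starting at +1 relative to the proposed cleavage site.
        core = protein[leader_len:leader_len + min_core]
        fraction = sum(r in HYDROPHOBIC for r in core) / min_core
        if charge > 0 and core[0] == "F" and fraction >= min_fraction:
            return leader_len
    return None
```

Returning the leader length rather than a boolean keeps the proposed cleavage position available to downstream steps, such as trimming the leader before expression.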

Studies of the cleavage sites of twenty-six bacterial lipoprotein precursors have allowed the definition of a consensus
amino acid sequence for lipoprotein cleavage. Nearly three-fourths of the bacterial lipoprotein precursors examined
contained the sequence L-(A,S)-(G,A)-C at positions -3 to +1, relative to the point of cleavage (Hayashi, S. and Wu,
H. C. Lipoproteins in bacteria. J. Bioenerg. Biomembr. 22, 451-471; 1990).
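The consensus L-(A,S)-(G,A)-C translates directly into a regular expression. The sketch below, with a hypothetical function name and an illustrative input, reports the position of the +1 cysteine for each match; it does not check where in the protein the match falls, which a fuller screen would presumably restrict to the amino terminal region.

```python
import re

# L-(A,S)-(G,A)-C at positions -3 to +1 relative to the cleavage point.
LIPOBOX = re.compile(r"L[AS][GA]C")


def lipoprotein_cleavage_sites(protein):
    """Return the 0-based index of the +1 cysteine for each consensus match."""
    return [match.start() + 3 for match in LIPOBOX.finditer(protein)]
```

Slicing the sequence at a returned index, e.g. `protein[site:]`, gives the predicted mature form beginning with the cysteine.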

It is well known that most anchored proteins found on the surface of gram-positive bacteria possess a highly
conserved carboxy terminal sequence. More than fifty such proteins from organisms such as S. pyogenes, S. mutans, E.
faecalis, S. pneumoniae, and others, have been identified based on their extracellular location and carboxy terminal
amino acid sequence (Fischetti, V. A. Gram-positive commensal bacteria deliver antigens to elicit mucosal and systemic
immunity. ASM News 62, 405-410; 1996). The conserved region is comprised of six charged amino acids at the extreme
carboxy terminus coupled to 15-20 hydrophobic amino acids presumed to function as a transmembrane domain.
Immediately adjacent to the transmembrane domain is a six amino acid sequence conserved in nearly all proteins
examined. The amino acid sequence of this region is L-P-X-T-G-X, where X is any amino acid.
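A search for the L-P-X-T-G-X motif near the carboxy terminus can be sketched as follows. The function name, the `max_tail` cutoff of 40 residues, and the returned tuple layout are assumptions chosen for illustration; the text itself specifies only the motif and the approximate composition of the residues that follow it.

```python
import re

# Lookahead so that overlapping occurrences of L-P-X-T-G-X are all reported.
LPXTGX = re.compile(r"(?=(LP.TG.))")


def find_lpxtg(protein, max_tail=40):
    """Return (1-based start, motif, residues after motif) for C-terminal hits.

    max_tail is an assumed upper bound on the number of residues allowed
    between the end of the motif and the carboxy terminus.
    """
    hits = []
    for match in LPXTGX.finditer(protein):
        start = match.start()
        tail_length = len(protein) - (start + 6)
        if tail_length <= max_tail:
            hits.append((start + 1, match.group(1), tail_length))
    return hits
```

A tail length in the low twenties is roughly what the description above would predict (15-20 hydrophobic residues plus six charged residues), so the third tuple element can be used as a further filter.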

Amino acid sequence similarity to proteins of known function, as determined by BLAST, enables the assignment of
putative functions to novel amino acid sequences and allows for the selection of proteins thought to function outside
the cell wall. Such proteins are well known in the art and include those described as "lipoprotein", "periplasmic", or
"antigen".

An algorithm for selecting antigenic and immunogenic Staphylococcus aureus polypeptides including the foregoing
criteria was developed by the present inventors. Use of the algorithm by the inventors to select immunologically useful
Staphylococcus aureus polypeptides resulted in the selection of several ORFs which are predicted to be
outermembrane-associated proteins. These proteins are identified in Table 4, below, and shown in the Sequence Listing as SEQ
ID NOS:5,192 to 5,255. Thus the amino acid sequence of each of several antigenic Staphylococcus aureus polypeptides
listed in Table 4 can be determined, for example, by locating the amino acid sequence of the ORF in the Sequence
Listing. Likewise the polynucleotide sequence encoding each ORF can be found by locating the corresponding
polynucleotide SEQ ID in Tables 1, 2, or 3, and finding the corresponding nucleotide sequence in the sequence listing.

As will be appreciated by those of ordinary skill in the art, although a polypeptide representing an entire ORF may
be the closest approximation to a protein found in vivo, it is not always technically practical to express a complete ORF
in vitro. It may be very challenging to express and purify a highly hydrophobic protein by common laboratory methods.
As a result, the immunologically useful polypeptides described herein as SEQ ID NOS:5,192-5,255 may have been
modified slightly to simplify the production of recombinant protein, and are the preferred embodiments. In general,
nucleotide sequences which encode highly hydrophobic domains, such as those found at the amino terminal signal
sequence, are excluded for enhanced in vitro expression of the polypeptides. Furthermore, any highly hydrophobic
amino acid sequences occurring at the carboxy terminus are also excluded. Such truncated polypeptides include, for
example, the mature forms of the polypeptides expected to exist in nature.

Those of ordinary skill in the art can identify soluble portions of the polypeptides identified in Table 4, and in the
case of truncated polypeptide sequences shown as SEQ ID NOS:5,192-5,255, may obtain the complete predicted amino
acid sequence of each polypeptide by translating the corresponding polynucleotide sequences of the corresponding
ORF listed in Tables 1, 2 and 3 and found in the sequence listing.

Accordingly, polypeptides comprising the complete amino acid sequence of an immunologically useful polypeptide
selected from the group of polypeptides encoded by the ORFs identified in Table 4, or an amino acid sequence at least 95%
identical thereto, preferably at least 97% identical thereto, and most preferably at least 99% identical thereto, form an
embodiment of the invention; in addition, polypeptides comprising an amino acid sequence selected from the group of
amino acid sequences shown in the sequence listing as SEQ ID NOS:5,191-5,255, or an amino acid sequence at least
95% identical thereto, preferably at least 97% identical thereto and most preferably at least 99% identical thereto, form
an embodiment of the invention. Polynucleotides encoding the foregoing polypeptides also form part of the present
invention.

In another aspect, the invention provides a peptide or polypeptide comprising an epitope-bearing portion of a
polypeptide of the invention, particularly those epitope-bearing portions (antigenic regions) identified in Table 4. The
epitope-bearing portion is an immunogenic or antigenic epitope of a polypeptide of the invention. An "immunogenic
epitope" is defined as a part of a protein that elicits an antibody response when the whole protein is the immunogen.
On the other hand, a region of a protein molecule to which an antibody can bind is defined as an "antigenic epitope."
The number of immunogenic epitopes of a protein generally is less than the number of antigenic epitopes. See, for
instance, Geysen et al., Proc. Natl. Acad. Sci. USA 81:3998-4002 (1983).

As to the selection of peptides or polypeptides bearing an antigenic epitope (i.e., that contain a region of a protein
molecule to which an antibody can bind), it is well known in the art that relatively short synthetic peptides that mimic
part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein.
See, for instance, Sutcliffe, J. G., Shinnick, T. M., Green, N. and Lerner, R. A. (1983) "Antibodies that react with
predetermined sites on proteins", Science, 219:660-666. Peptides capable of eliciting protein-reactive sera are
frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and
are confined neither to immunodominant regions of intact proteins (i.e., immunogenic epitopes) nor to the amino or
carboxyl terminals. Antigenic epitope-bearing peptides and polypeptides of the invention are therefore useful to raise
antibodies, including monoclonal antibodies, that bind specifically to a polypeptide of the invention. See, for instance,
Wilson et al., Cell 37:767-778 (1984) at 777.

Antigenic epitope-bearing peptides and polypeptides of the invention preferably contain a sequence of at least
seven, more preferably at least nine and most preferably between about 15 and about 30 amino acids contained within
the amino acid sequence of a polypeptide of the invention. Non-limiting examples of antigenic polypeptides or peptides
that can be used to generate S. aureus specific antibodies include: a polypeptide comprising peptides shown in Table
4 below. These polypeptide fragments have been determined to bear antigenic epitopes of indicated S. aureus proteins
by the analysis of the Jameson-Wolf antigenic index, a representative sample of which is shown in Figure 3.

The epitope-bearing peptides and polypeptides of the invention may be produced by any conventional means.
See, e.g., Houghten, R. A. (1985) General method for the rapid solid-phase synthesis of large numbers of peptides:
specificity of antigen-antibody interaction at the level of individual amino acids. Proc. Natl. Acad. Sci. USA 82:
5131-5135; this "Simultaneous Multiple Peptide Synthesis (SMPS)" process is further described in U.S. Patent No.
4,631,211 to Houghten et al. (1986). Epitope-bearing peptides and polypeptides of the invention are used to induce
antibodies according to methods well known in the art. See, for instance, Sutcliffe et al., supra; Wilson et al., supra;
Chow, M. et al., Proc. Natl. Acad. Sci. USA 82:910-914; and Bittle, F. J. et al., J. Gen. Virol. 66:2347-2354 (1985).

Immunogenic epitope-bearing peptides of the invention, i.e., those parts of a protein that elicit an antibody response
when the whole protein is the immunogen, are identified according to methods known in the art. See, for instance,
Geysen et al., supra. Further still, U.S. Patent No. 5,194,392 to Geysen (1990) describes a general method of detecting
or determining the sequence of monomers (amino acids or other compounds) which is a topological equivalent of the
epitope (i.e., a "mimotope") which is complementary to a particular paratope (antigen binding site) of an antibody of
interest. More generally, U.S. Patent No. 4,433,092 to Geysen (1989) describes a method of detecting or determining
a sequence of monomers which is a topographical equivalent of a ligand which is complementary to the ligand binding
site of a particular receptor of interest. Similarly, U.S. Patent No. 5,480,971 to Houghten, R. A. et al. (1996) on
Peralkylated Oligopeptide Mixtures discloses linear C1-C7-alkyl peralkylated oligopeptides and sets and libraries of such
peptides, as well as methods for using such oligopeptide sets and libraries for determining the sequence of a
peralkylated oligopeptide that preferentially binds to an acceptor molecule of interest. Thus, non-peptide analogs of the
epitope-bearing peptides of the invention also can be made routinely by these methods.

Table 4 lists immunologically useful polypeptides identified by an algorithm which locates novel Staphylococcus
aureus outermembrane proteins, as is described above. Also listed are epitopes or "antigenic regions" of each of the
identified polypeptides. The antigenic regions, or epitopes, are delineated by two numbers x-y where x is the number
of the first amino acid in the open reading frame included within the epitope and y is the number of the last amino acid
in the open reading frame included within the epitope. For example, the first epitope in ORF 168-6 is comprised of
amino acids 36 to 45 of SEQ ID NO:5,192, as is described in Table 4. The inventors have identified several epitopes
for each of the antigenic polypeptides identified in Table 4. Accordingly, forming part of the present invention are
polypeptides comprising an amino acid sequence of one or more antigenic regions identified in Table 4. The invention
further provides polynucleotides encoding such polypeptides.
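Because the x-y notation counts residues from 1 and includes both endpoints, extracting an antigenic region from an ORF translation requires a small index adjustment. The helper names and the placeholder sequence used to exercise them are hypothetical; the sketch assumes the region is given as a string such as "36-45" and that the ORF amino acid sequence is available as a plain string.

```python
def parse_region(region):
    """Parse an 'x-y' antigenic region into 1-based inclusive integers."""
    first, last = (int(part) for part in region.split("-"))
    if first < 1 or last < first:
        raise ValueError(f"invalid region: {region!r}")
    return first, last


def epitope_residues(orf_protein, region):
    """Return the residues x through y (1-based, inclusive) of an ORF translation."""
    first, last = parse_region(region)
    if last > len(orf_protein):
        raise ValueError("region extends past the end of the sequence")
    # Convert the 1-based inclusive range to a 0-based half-open slice.
    return orf_protein[first - 1:last]
```

A region written as 36-45 therefore yields a ten-residue peptide, consistent with the example of amino acids 36 to 45 given above.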

The present invention further includes isolated polypeptides, proteins and nucleic acid molecules which are
substantially equivalent to those herein described. As used herein, substantially equivalent can refer both to nucleic acid
and amino acid sequences, for example a mutant sequence, that varies from a reference sequence by one or more
substitutions, deletions, or additions, the net effect of which does not result in an adverse functional dissimilarity
between reference and subject sequences. For purposes of the present invention, sequences having equivalent biological
activity, and equivalent expression characteristics are considered substantially equivalent. For purposes of determining
equivalence, truncation of the mature sequence should be disregarded.

The invention further provides methods of obtaining homologs from other strains of Staphylococcus aureus, of the
fragments of the Staphylococcus aureus genome of the present invention and homologs of the proteins encoded by
the ORFs of the present invention. As used herein, a sequence or protein of Staphylococcus aureus is defined as a
homolog of a fragment of the Staphylococcus aureus fragments or contigs or a protein encoded by one of the ORFs
of the present invention, if it shares significant homology to one of the fragments of the Staphylococcus aureus genome
of the present invention or a protein encoded by one of the ORFs of the present invention. Specifically, by using the
sequence disclosed herein as a probe or as primers, and techniques such as PCR cloning and colony/plaque
hybridization, one skilled in the art can obtain homologs.

As used herein, two nucleic acid molecules or proteins are said to "share significant homology" if the two contain
regions which possess greater than 85% sequence (amino acid or nucleic acid) homology. Preferred homologs in this
regard are those with more than 90% homology. Especially preferred are those with 93% or more homology. Among
especially preferred homologs those with 95% or more homology are particularly preferred. Very particularly preferred
among these are those with 97% and even more particularly preferred among those are homologs with 99% or more
homology. The most preferred homologs among these are those with 99.9% homology or more. It will be understood
that, among measures of homology, identity is particularly preferred in this regard.
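The graded thresholds above can be illustrated with a short sketch that computes percent identity over a pre-aligned pair of sequences and reports the highest tier reached. The way identity is computed here (matching columns divided by all columns that are not gaps in both sequences), the treatment of the 97% tier as "97% or more", and the function and label names are assumptions made for illustration; the text does not prescribe a particular alignment or scoring method.

```python
# (threshold, inclusive, label), ordered from most to least stringent.
TIERS = [
    (99.9, True, "most preferred"),
    (99.0, True, "even more particularly preferred"),
    (97.0, True, "very particularly preferred"),
    (95.0, True, "particularly preferred"),
    (93.0, True, "especially preferred"),
    (90.0, False, "preferred"),
    (85.0, False, "significant homology"),
]


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over two equal-length, pre-aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            continue  # ignore columns that are gaps in both sequences
        columns += 1
        if a == b:
            matches += 1
    return 100.0 * matches / columns if columns else 0.0


def homology_tier(identity):
    """Return the label of the highest tier met, or None below all tiers."""
    for threshold, inclusive, label in TIERS:
        if identity > threshold or (inclusive and identity == threshold):
            return label
    return None
```

Note that exactly 90% identity falls in the "significant homology" tier rather than the "preferred" tier, because the text requires more than 90% for the latter.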

Region specific primers or probes derived from the nucleotide sequence provided in SEQ ID NOS:1-5,191 or from
a nucleotide sequence at least 95%, particularly at least 99%, especially at least 99.5% identical to a sequence of SEQ
ID NOS:1-5,191 can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing
cloned DNA encoding a homolog. Methods suitable to this aspect of the present invention are well known and have
been described in great detail in many publications such as, for example, Innis et al., PCR PROTOCOLS, Academic
Press, San Diego, CA (1990).

When using primers derived from SEQ ID NOS:1-5,191 or from a nucleotide sequence having an aforementioned
identity to a sequence of SEQ ID NOS:1-5,191, one skilled in the art will recognize that by employing high stringency
conditions (e.g., annealing at 50-60°C in 6X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC) only
sequences which are greater than 75% homologous to the primer will be amplified. By employing lower stringency
conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in 0.5X SSPC),
sequences which are greater than 40-50% homologous to the primer will also be amplified.

When using DNA probes derived from SEQ ID NOS:1-5,191, or from a nucleotide sequence having an
aforementioned identity to a sequence of SEQ ID NOS:1-5,191, for colony/plaque hybridization, one skilled in the art will
recognize that by employing high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and
washing at 50-65°C in 0.5X SSPC), sequences having regions which are greater than 90% homologous to the probe
can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and
40-45% formamide, and washing at 42°C in 0.5X SSPC), sequences having regions which are greater than 35-45%
homologous to the probe will be obtained.

Any organism can be used as the source for homologs of the present invention so long as the organism naturally
expresses such a protein or contains genes encoding the same. The most preferred organisms for isolating homologs
are bacteria which are closely related to Staphylococcus aureus.

ILLUSTRATIVE USES OF COMPOSITIONS OF THE INVENTION 

Each ORF provided in Tables 1 and 2 is identified with a function by homology to a known gene or polypeptide.
As a result, one skilled in the art can use the polypeptides of the present invention for commercial, therapeutic and
industrial purposes consistent with the type of putative identification of the polypeptide. Such identifications permit one
skilled in the art to use the Staphylococcus aureus ORFs in a manner similar to the known type of sequences for which
the identification is made; for example, to ferment a particular sugar source or to produce a particular metabolite. A
variety of reviews illustrative of this aspect of the invention are available, including the following reviews on the industrial
use of enzymes, for example, BIOCHEMICAL ENGINEERING AND BIOTECHNOLOGY HANDBOOK, 2nd Ed.,
Macmillan Publications, Ltd., NY (1991) and BIOCATALYSTS IN ORGANIC SYNTHESES, Tramper et al., Eds., Elsevier
Science Publishers, Amsterdam, The Netherlands (1985). A variety of exemplary uses that illustrate this and similar
aspects of the present invention are discussed below.

1. Biosynthetic Enzymes 

Open reading frames encoding proteins involved in mediating the catalytic reactions involved in intermediary and
macromolecular metabolism, the biosynthesis of small molecules, cellular processes and other functions include those
encoding enzymes involved in respiration, both aerobic and anaerobic, enzymes involved in fermentation, enzymes
involved in ATP proton motor force conversion, enzymes involved in broad regulatory function, enzymes involved in
amino acid synthesis, enzymes involved in nucleotide synthesis, and enzymes involved in cofactor and vitamin synthesis.

The various metabolic pathways present in Staphylococcus aureus can be identified based on absolute nutritional
requirements as well as by examining the various enzymes identified in Tables 1-3 and SEQ ID NOS:1-5,191.

Of particular interest are polypeptides involved in the degradation of intermediary metabolites as well as
non-macromolecular metabolism. Such enzymes include amylases, glucose oxidases, and catalase.

Proteolytic enzymes are another class of commercially important enzymes. Proteolytic enzymes find use in a
number of industrial processes including the processing of flax and other vegetable fibers, in the extraction, clarification
and depectinization of fruit juices, in the extraction of vegetables' oil and in the maceration of fruits and vegetables to
give unicellular fruits. A detailed review of the proteolytic enzymes used in the food industry is provided in Rombouts
et al., Symbiosis 21:79 (1986) and Voragen et al. in BIOCATALYSTS IN AGRICULTURAL BIOTECHNOLOGY,
Whitaker et al., Eds., American Chemical Society Symposium Series 389:93 (1989).

The metabolism of sugars is an important aspect of the primary metabolism of Staphylococcus aureus. Enzymes
involved in the degradation of sugars, such as, particularly, glucose, galactose, fructose and xylose, can be used in
industrial fermentation. Some of the important sugar transforming enzymes, from a commercial viewpoint, include
sugar isomerases such as glucose isomerase. Other metabolic enzymes have found commercial use, such as glucose
oxidases, which produce ketogulonic acid (KGA). KGA is an intermediate in the commercial production of ascorbic
acid using the Reichstein's procedure, as described in Krueger et al., Biotechnology 6(A), Rhine et al., Eds., Verlag
Press, Weinheim, Germany (1984).

Glucose oxidase (GOD) is commercially available and has been used in purified form as well as in an immobilized
form for the deoxygenation of beer. See, for instance, Hartmeir et al., Biotechnology Letters 1:21 (1979). The most
important application of GOD is the industrial scale fermentation of gluconic acid. There is a market for gluconic acids,
which are used in the detergent, textile, leather, photographic, pharmaceutical, food, feed and concrete industries, as
described, for example, in Bigelis et al., beginning on page 357 in GENE MANIPULATIONS AND FUNGI; Benett et al., Eds.,
Academic Press, New York (1985). In addition to industrial applications, GOD has found applications in medicine for
quantitative determination of glucose in body fluids and, recently, in biotechnology for analyzing syrups from starch and
cellulose hydrosylates. This application is described in Owusu et al., Biochem. et Biophysica. Acta. 872:83 (1986), for
instance.

The main sweetener used in the world today is sugar which comes from sugar beets and sugar cane. In the field
of industrial enzymes, the glucose isomerase process shows the largest expansion in the market today. Initially, soluble
enzymes were used and later immobilized enzymes were developed (Krueger et al., Biotechnology, The Textbook of
Industrial Microbiology, Sinauer Associated Incorporated, Sunderland, Massachusetts (1990)). Today, the use of
glucose-produced high fructose syrups is by far the largest industrial business using immobilized enzymes. A review of
the industrial use of these enzymes is provided by Jorgensen, Starch 40:307 (1988).

Proteinases, such as alkaline serine proteinases, are used as detergent additives and thus represent one of the
largest volumes of microbial enzymes used in the industrial sector. Because of their industrial importance, there is a
large body of published and unpublished information regarding the use of these enzymes in industrial processes. (See
Faultman et al., Acid Proteases Structure Function and Biology, Tang, J., ed., Plenum Press, New York (1977) and
Godfrey et al., Industrial Enzymes, MacMillan Publishers, Surrey, UK (1983) and Hepner et al., Report Industrial
Enzymes by 1990, Hel Hepner & Associates, London (1986)).

Another class of commercially usable proteins of the present invention are the microbial lipases, described by, for
instance, Macrae et al., Philosophical Transactions of the Royal Society of London 310:227 (1985) and Poserke,
Journal of the American Oil Chemist Society 61:1758 (1984). A major use of lipases is in the fat and oil industry for the
production of neutral glycerides using lipase catalyzed inter-esterification of readily available triglycerides. Applications
of lipases include use as a detergent additive to facilitate the removal of fats from fabrics in the course of the
washing procedures.

The use of enzymes, and in particular microbial enzymes, as catalysts for key steps in the synthesis of complex
organic molecules is gaining popularity at a great rate. One area of great interest is the preparation of chiral
intermediates. Preparation of chiral intermediates is of interest to a wide range of synthetic chemists, particularly those
scientists involved with the preparation of new pharmaceuticals, agrochemicals, fragrances and flavors. (See Davies et al.,
Recent Advances in the Generation of Chiral Intermediates Using Enzymes, CRC Press, Boca Raton, Florida (1990)).
The following reactions catalyzed by enzymes are of interest to organic chemists: hydrolysis of carboxylic acid esters,
phosphate esters, amides and nitriles, esterification reactions, trans-esterification reactions, synthesis of amides,
reduction of alkanones and oxoalkanates, oxidation of alcohols to carbonyl compounds, oxidation of sulfides to sulfoxides,
and carbon bond forming reactions such as the aldol reaction.

When considering the use of an enzyme encoded by one of the ORFs of the present invention for biotransformation
and organic synthesis it is sometimes necessary to consider the respective advantages and disadvantages of using a
microorganism as opposed to an isolated enzyme. The pros and cons of using a whole cell system on the one hand or an
isolated partially purified enzyme on the other hand have been described in detail by Bud et al., Chemistry in Britain
(1987), p. 127.

Amino transferases, enzymes involved in the biosynthesis and metabolism of amino acids, are useful in the catalytic
production of amino acids. The advantage of using microbial based enzyme systems is that the amino transferase
enzymes catalyze the stereo-selective synthesis of only L-amino acids and generally possess uniformly high catalytic
rates. A description of the use of amino transferases for amino acid production is provided by Roselle-David, Methods
of Enzymology 136:479 (1987).

Another category of useful proteins encoded by the ORFs of the present invention includes enzymes involved in
nucleic acid synthesis, repair, and recombination. A variety of commercially important enzymes have previously been
isolated from members of Staphylococcus aureus. These include Sau3A and Sau96I.

2. Generation of Antibodies



As described here, the proteins of the present invention, as well as homologs thereof, can be used in a variety of
procedures and methods known in the art which are currently applied to other proteins. The proteins of the present
invention can further be used to generate an antibody which selectively binds the protein. Such antibodies can be
either monoclonal or polyclonal antibodies, as well as fragments of these antibodies, and humanized forms.

The invention further provides antibodies which selectively bind to one of the proteins of the present invention and 
hybridomas which produce these antibodies. A hybridoma is an immortalized cell line which is capable of secreting a 
specific monoclonal antibody. 

In general, techniques for preparing polyclonal and monoclonal antibodies as well as hybridomas capable of
producing the desired antibody are well known in the art (Campbell, A. M., MONOCLONAL ANTIBODY TECHNOLOGY:
LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, Elsevier Science Publishers,
Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21 (1980); Kohler and Milstein, Nature
256:495-497 (1975)). Such techniques include the trioma technique and the human B-cell hybridoma technique
(Kozbor et al., Immunology Today 4:72 (1983); pgs. 77-96 of Cole et al., in MONOCLONAL ANTIBODIES AND CANCER
THERAPY, Alan R. Liss, Inc. (1985)).

Any animal (mouse, rabbit, etc.) which is known to produce antibodies can be immunized with the pseudogene
polypeptide. Methods for immunization are well known in the art. Such methods include subcutaneous or intraperitoneal
injection of the polypeptide. One skilled in the art will recognize that the amount of the protein encoded by the ORF of
the present invention used for immunization will vary based on the animal which is immunized, the antigenicity of the
peptide and the site of injection.

The protein which is used as an immunogen may be modified or administered in an adjuvant in order to increase
the protein's antigenicity. Methods of increasing the antigenicity of a protein are well known in the art and include, but
are not limited to, coupling the antigen with a heterologous protein (such as globulin or galactosidase) or through the
inclusion of an adjuvant during immunization.

For monoclonal antibodies, spleen cells from the immunized animals are removed, fused with myeloma cells, such 
as SP2/0-Ag14 myeloma cells, and allowed to become monoclonal antibody producing hybridoma cells. 

Any one of a number of methods well known in the art can be used to identify the hybridoma cell which produces 
an antibody with the desired characteristics. These include screening the hybridomas with an ELISA assay, western 
blot analysis, or radioimmunoassay (Lutz et al., Exp. Cell Res. 175: 109-124 (1988)).

Hybridomas secreting the desired antibodies are cloned and the class and subclass is determined using procedures 
known in the art (Campbell, A. M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and
Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984)).

Techniques described for the production of single chain antibodies (U. S. Patent 4,946,778) can be adapted to 
produce single chain antibodies to proteins of the present invention. 

For polyclonal antibodies, antibody-containing antisera is isolated from the immunized animal and is screened for
the presence of antibodies with the desired specificity using one of the above-described procedures.

The present invention further provides the above-described antibodies in detectably labelled form. Antibodies can
be detectably labelled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels
(such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.),
paramagnetic atoms, etc. Procedures for accomplishing such labelling are well known in the art (for example, see
Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer, E. A. et al., Meth. Enzym. 62:308 (1979); Engval,
E. et al., Immunol. 109:129 (1972); Goding, J. W., J. Immunol. Meth. 13:215 (1976)).

The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ assays to identify cells 
or tissues in which a fragment of the Staphylococcus aureus genome is expressed. 

The present invention further provides the above-described antibodies immobilized on a solid support. Examples
of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose,
acrylic resins such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such solid supports
are well known in the art (Weir, D. M. et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific
Publications, Oxford, England, Chapter 10 (1986); Jacoby, W. D. et al., Meth. Enzym. 34, Academic Press, N.Y. (1974)).
The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for 
immunoaffinity purification of the proteins of the present invention. 

3. Diagnostic Assays and Kits

The present invention further provides methods to identify the expression of one of the ORFs of the present
invention, or homolog thereof, in a test sample, using one of the DFs, antigens or antibodies of the present invention.

In detail, such methods comprise incubating a test sample with one or more of the antibodies, or one or more of 
the DFs, or one or more antigens of the present invention and assaying for binding of the DFs, antigens or antibodies 
to components within the test sample. 

Conditions for incubating a DF, antigen or antibody with a test sample vary. Incubation conditions depend on the 
format employed in the assay, the detection methods employed, and the type and nature of the DF or antibody used 
in the assay. One skilled in the art will recognize that any one of the commonly available hybridization, amplification 
or immunological assay formats can readily be adapted to employ the DFs, antigens or antibodies of the present
invention. Examples of such assays can be found in Chard, T., An Introduction to Radioimmunoassay and Related
Techniques, Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al., Techniques in
Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice
and Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier
Science Publishers, Amsterdam, The Netherlands (1985); and PCT publication WO95/32291, all of which are hereby
incorporated herein by reference.

The test samples of the present invention include cells, protein or membrane extracts of cells, or biological fluids
such as sputum, blood, serum, plasma, or urine. The test sample used in the above-described method will vary based
on the assay format, nature of the detection method and the tissues, cells or extracts used as the sample to be assayed.
Methods for preparing protein extracts or membrane extracts of cells are well known in the art and can readily be
adapted in order to obtain a sample which is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the necessary reagents to carry
out the assays of the present invention.

Specifically, the invention provides a compartmentalized kit to receive, in close confinement, one or more containers 
which comprises: (a) a first container comprising one of the DFs, antigens or antibodies of the present invention; and
(b) one or more other containers comprising one or more of the following: wash reagents, reagents capable of detecting 
presence of a bound DF, antigen or antibody. 

In detail, a compartmentalized kit includes any kit in which reagents are contained in separate containers. Such
containers include small glass containers, plastic containers or strips of plastic or paper. Such containers allow one
to efficiently transfer reagents from one compartment to another compartment such that the samples and reagents are
not cross-contaminated, and the agents or solutions of each container can be added in a quantitative fashion from one
compartment to another. Such containers will include a container which will accept the test sample, a container which
contains the antibodies used in the assay, containers which contain wash reagents (such as phosphate buffered saline,
Tris-buffers, etc.), and containers which contain the reagents used to detect the bound antibody, antigen or DF.

Types of detection reagents include labelled nucleic acid probes, labelled secondary antibodies, or in the
alternative, if the primary antibody is labelled, the enzymatic, or antibody binding reagents which are capable of reacting with
the labelled antibody. One skilled in the art will readily recognize that the disclosed DFs, antigens and antibodies of the
present invention can be readily incorporated into one of the established kit formats which are well known in the art.

4. Screening Assay for Binding Agents 

Using the isolated proteins of the present invention, the present invention further provides methods of obtaining 
and identifying agents which bind to a protein encoded by one of the ORFs of the present invention or to one of the
fragments and the Staphylococcus aureus fragment and contigs herein described.
In general, such methods comprise steps of: 

(a) contacting an agent with an isolated protein encoded by one of the ORFs of the present invention, or an isolated 
fragment of the Staphylococcus aureus genome; and

(b) determining whether the agent binds to said protein or said fragment. 

The agents screened in the above assay can be, but are not limited to, peptides, carbohydrates, vitamin derivatives, 
or other pharmaceutical agents. The agents can be selected and screened at random or rationally selected or designed 
using protein modeling techniques.

For random screening, agents such as peptides, carbohydrates, pharmaceutical agents and the like are selected 
at random and are assayed for their ability to bind to the protein encoded by the ORF of the present invention. 

Alternatively, agents may be rationally selected or designed. As used herein, an agent is said to be "rationally
selected or designed" when the agent is chosen based on the configuration of the particular protein. For example, one
skilled in the art can readily adapt currently available procedures to generate peptides, pharmaceutical agents and the
like capable of binding to a specific peptide sequence in order to generate rationally designed antipeptide peptides
(for example, see Hurby et al., "Application of Synthetic Peptides: Antisense Peptides," in Synthetic Peptides, A User's
Guide, W. H. Freeman, NY (1992), pp. 289-307, and Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical
agents, or the like.

In addition to the foregoing, one class of agents of the present invention, as broadly described, can be used to
control gene expression through binding to one of the ORFs or EMFs of the present invention. As described above,
such agents can be randomly screened or rationally designed/selected. Targeting the ORF or EMF allows a skilled 
artisan to design sequence specific or element specific agents, modulating the expression of either a single ORF or 
multiple ORFs which rely on the same EMF for expression control. 

One class of DNA binding agents are agents which contain base residues which hybridize or form a triple helix by
binding to DNA or RNA. Such agents can be based on the classic phosphodiester, ribonucleic acid backbone, or can
be a variety of sulfhydryl or polymeric derivatives which have base attachment capacity.

Agents suitable for use in these methods usually contain 20 to 40 bases and are designed to be complementary 
to a region of the gene involved in transcription (triple helix - see Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney
et al., Science 241:456 (1988); and Dervan et al., Science 251: 1360 (1991)) or to the mRNA itself (antisense - Okano,
J. Neurochem. 56:560 (1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of RNA transcription from DNA, while antisense
RNA hybridization blocks translation of an mRNA molecule into polypeptide. Both techniques have been demonstrated
to be effective in model systems. Information contained in the sequences of the present invention can be used to design
antisense and triple helix-forming oligonucleotides, and other DNA binding agents. 
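
As an illustration of the antisense design described above, the following Python sketch derives an oligodeoxynucleotide complementary to a chosen region of an mRNA. It is a minimal sketch under stated assumptions: the function name, the example mRNA string and the choice of start position are hypothetical and are not taken from the sequences of the present invention; only the 20 to 40 base length range comes from the text.

```python
# Watson-Crick partner of each mRNA (or DNA) base, written as a DNA base.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}


def antisense_oligo(mrna: str, start: int, length: int = 20) -> str:
    """Return a DNA oligo (5'->3') complementary to mrna[start:start+length].

    The 20-40 base window follows the length range given in the text;
    everything else here is an illustrative assumption.
    """
    if not 20 <= length <= 40:
        raise ValueError("oligo length should be between 20 and 40 bases")
    target = mrna.upper()[start:start + length]
    if len(target) < length:
        raise ValueError("target region runs past the end of the sequence")
    # Reverse the target, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(target))


# Hypothetical 30-nt message used only to exercise the function.
example_mrna = "AUGGCUAGCUUGACCGAUCGUAAGGCUUAA"
oligo = antisense_oligo(example_mrna, start=0, length=20)
print(oligo)
```

In practice the target window would be chosen from a region of interest, such as the translation start site, rather than fixed at position 0.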

5. Pharmaceutical Compositions and Vaccines

The present invention further provides pharmaceutical agents which can be used to modulate the growth or
pathogenicity of Staphylococcus aureus, or another related organism, in vivo or in vitro. As used herein, a "pharmaceutical
agent" is defined as a composition of matter which can be formulated using known techniques to provide a
pharmaceutical composition. As used herein, the "pharmaceutical agents of the present invention" refers to the pharmaceutical
agents which are derived from the proteins encoded by the ORFs of the present invention or are agents which are 
identified using the herein described assays. 

As used herein, a pharmaceutical agent is said to "modulate the growth or pathogenicity of Staphylococcus aureus 
or a related organism, in vivo or in vitro," when the agent reduces the rate of growth, rate of division, or viability of the 
organism in question. The pharmaceutical agents of the present invention can modulate the growth or pathogenicity 
of an organism in many fashions, although an understanding of the underlying mechanism of action is not needed to 
practice the use of the pharmaceutical agents of the present invention. Some agents will modulate the growth or path- 
ogenicity by binding to an important protein thus blocking the biological activity of the protein, while other agents may 
bind to a component of the outer surface of the organism blocking attachment or rendering the organism more prone
to the action of the body's natural immune system. Alternatively, the agent may comprise a protein encoded by one of the ORFs
of the present invention and serve as a vaccine. The development and use of vaccines derived from membrane
associated polypeptides are well known in the art. The inventors have identified particularly preferred immunogenic
Staphylococcus aureus polypeptides for use as vaccines. Such immunogenic polypeptides are described above and
summarized in Table 4, below.

As used herein, a "related organism" is a broad term which refers to any organism whose growth or pathogenicity 
can be modulated by one of the pharmaceutical agents of the present invention. In general, such an organism will 
contain a homolog of the protein which is the target of the pharmaceutical agent or the protein used as a vaccine. As 
such, related organisms do not need to be bacterial but may be fungal or viral pathogens. 

The pharmaceutical agents and compositions of the present invention may be administered in a convenient man- 
ner, such as by the oral, topical, intravenous, intraperitoneal, intramuscular, subcutaneous, intranasal or intradermal 
routes. The pharmaceutical compositions are administered in an amount which is effective for treatment and/or
prophylaxis of the specific indication. In general, they are administered in an amount of at least about 1 mg/kg body weight
and in most cases they will be administered in an amount not in excess of about 1 g/kg body weight per day. In most 
cases, the dosage is from about 0.1 mg/kg to about 10 g/kg body weight daily, taking into account the routes of ad- 
ministration, symptoms, etc. 

The agents of the present invention can be used in native form or can be modified to form a chemical derivative.
As used herein, a molecule is said to be a "chemical derivative" of another molecule when it contains additional chemical
moieties not normally a part of the molecule. Such moieties may improve the molecule's solubility, absorption, biological
half life, etc. The moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any undesirable
side effect of the molecule, etc. Moieties capable of mediating such effects are disclosed in, among other sources,
REMINGTON'S PHARMACEUTICAL SCIENCES (1980) cited elsewhere herein.

For example, such moieties may change an immunological character of the functional derivative, such as affinity 
for a given antibody. Such changes in immunomodulation activity are measured by the appropriate assay, such as a 
competitive type immunoassay. Modifications of such protein properties as redox or thermal stability, biological half- 
life, hydrophobicity, susceptibility to proteolytic degradation or the tendency to aggregate with carriers or into multimers 
also may be effected in this way and can be assayed by methods well known to the skilled artisan.

The therapeutic effects of the agents of the present invention may be obtained by providing the agent to a patient 
by any suitable means (e.g., inhalation, intravenously, intramuscularly, subcutaneously, enterally, or parenterally). It is 
preferred to administer the agent of the present invention so as to achieve an effective concentration within the blood 
or tissue in which the growth of the organism is to be controlled. To achieve an effective blood concentration, the 
preferred method is to administer the agent by injection. The administration may be by continuous infusion, or by single 
or multiple injections. 

In providing a patient with one of the agents of the present invention, the dosage of the administered agent will
vary depending upon such factors as the patient's age, weight, height, sex, general medical condition, previous medical
history, etc. In general, it is desirable to provide the recipient with a dosage of agent which is in the range of from about
1 pg/kg to 10 mg/kg (body weight of patient), although a lower or higher dosage may be administered. The
therapeutically effective dose can be lowered by using combinations of the agents of the present invention or another agent.

As used herein, two or more compounds or agents are said to be administered "in combination" with each other
when either (1) the physiological effects of each compound, or (2) the serum concentrations of each compound can
be measured at the same time. The composition of the present invention can be administered concurrently with, prior
to, or following the administration of the other agent. 

The agents of the present invention are intended to be provided to recipient subjects in an amount sufficient to 
decrease the rate of growth (as defined above) of the target organism. 

The administration of the agent(s) of the invention may be for either a "prophylactic" or "therapeutic" purpose.
When provided prophylactically, the agent(s) are provided in advance of any symptoms indicative of the organism's
growth. The prophylactic administration of the agent(s) serves to prevent, attenuate, or decrease the rate of onset of
any subsequent infection. When provided therapeutically, the agent(s) are provided at (or shortly after) the onset of an
indication of infection. The therapeutic administration of the compound(s) serves to attenuate the pathological
symptoms of the infection and to increase the rate of recovery.

The agents of the present invention are administered to a subject, such as a mammal, or a patient, in a
pharmaceutically acceptable form and in a therapeutically effective concentration. A composition is said to be
"pharmacologically acceptable" if its administration can be tolerated by a recipient patient. Such an agent is said to be administered
in a "therapeutically effective amount" if the amount administered is physiologically significant. An agent is
physiologically significant if its presence results in a detectable change in the physiology of a recipient patient.

The agents of the present invention can be formulated according to known methods to prepare pharmaceutically
useful compositions, whereby these materials, or their functional derivatives, are combined in admixture with a
pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins,
e.g., human serum albumin, are described, for example, in REMINGTON'S PHARMACEUTICAL SCIENCES, 16th Ed.,
Osol, A., Ed., Mack Publishing, Easton PA (1980). In order to form a pharmaceutically acceptable composition suitable
for effective administration, such compositions will contain an effective amount of one or more of the agents of the
present invention, together with a suitable amount of carrier vehicle.

Additional pharmaceutical methods may be employed to control the duration of action. Controlled release preparations
may be achieved through the use of polymers to complex or absorb one or more of the agents of the present invention.
The controlled delivery may be effectuated by a variety of well known techniques, including formulation with
macromolecules such as, for example, polyesters, polyamino acids, polyvinylpyrrolidone, ethylenevinylacetate,
methylcellulose, carboxymethylcellulose, or protamine sulfate, adjusting the concentration of the macromolecules and the agent
in the formulation, and by appropriate use of methods of incorporation, which can be manipulated to effectuate a desired
time course of release. Another possible method to control the duration of action by controlled release preparations is
to incorporate agents of the present invention into particles of a polymeric material such as polyesters, polyamino
acids, hydrogels, poly(lactic acid) or ethylene vinylacetate copolymers. Alternatively, instead of incorporating these
agents into polymeric particles, it is possible to entrap these materials in microcapsules prepared, for example, by
coacervation techniques or by interfacial polymerization with, for example, hydroxymethylcellulose or gelatine
microcapsules and poly(methylmethacrylate) microcapsules, respectively, or in colloidal drug delivery systems, for example,
liposomes, albumin microspheres, microemulsions, nanoparticles, and nanocapsules or in macroemulsions. Such
techniques are disclosed in REMINGTON'S PHARMACEUTICAL SCIENCES (1980).

The invention further provides a pharmaceutical pack or kit comprising one or more containers filled with one or
more of the ingredients of the pharmaceutical compositions of the invention. Associated with such container(s) can be
a notice in the form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals
or biological products, which notice reflects approval by the agency of manufacture, use or sale for human
administration.

In addition, the agents of the present invention may be employed in conjunction with other therapeutic compounds.

6. Shot-Gun Approach to Megabase DNA Sequencing

The present invention further demonstrates that a large sequence can be sequenced using a random shotgun 
approach. This procedure, described in detail in the examples that follow, has eliminated the up front cost of isolating 
and ordering overlapping or contiguous subclones prior to the start of the sequencing protocols. 

Certain aspects of the present invention are described in greater detail in the examples that follow. The examples 
are provided by way of illustration. Other aspects and embodiments of the present invention are contemplated by the
inventors, as will be clear to those of skill in the art from reading the present disclosure.

ILLUSTRATIVE EXAMPLES

LIBRARIES AND SEQUENCING

1. Shotgun Sequencing Probability Analysis

The overall strategy for a shotgun approach to whole genome sequencing follows from the Lander and Waterman
(Lander and Waterman, Genomics 2: 231 (1988)) application of the equation for the Poisson distribution. According
to this treatment, the probability, $P_0$, that any given base in a sequence of size L, in nucleotides, is not sequenced after
a certain amount, n, in nucleotides, of random sequence has been determined can be calculated by the equation
$P_0 = e^{-m}$, where $m = n/L$ is the fold coverage. For instance, for a genome of 2.8 Mb, m = 1 when 2.8 Mb of sequence has
been randomly generated (1X coverage). At that point, $P_0 = e^{-1} = 0.37$. The probability that any given base has not
been sequenced is the same as the probability that any region of the whole sequence has not been determined and,
therefore, is equivalent to the fraction of the whole sequence that has yet to be determined. Thus, at one-fold coverage,
approximately 37% of a polynucleotide of size L, in nucleotides, has not been sequenced. When 14 Mb of sequence
has been generated, coverage is 5X for a 2.8 Mb sequence and the unsequenced fraction drops to 0.0067, or 0.67%.
5X coverage of a 2.8 Mb sequence can be attained by sequencing approximately 17,000 random clones from both
insert ends with an average sequence read length of 410 bp.
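
The figures in the preceding paragraph can be checked with a few lines of Python. This is a minimal sketch, assuming only the Poisson relation and the numbers quoted in the text (2.8 Mb genome, 410 bp average read, two reads per clone); the function and variable names are illustrative.

```python
import math


def unsequenced_fraction(fold_coverage: float) -> float:
    """Poisson probability P0 = e^(-m) that a given base has not been sequenced."""
    return math.exp(-fold_coverage)


def clones_for_coverage(genome_bp: float, fold_coverage: float,
                        read_len_bp: float, reads_per_clone: int = 2) -> int:
    """Random clones needed to reach a fold coverage when each clone is
    sequenced from both insert ends (reads_per_clone = 2)."""
    total_bp = genome_bp * fold_coverage
    return math.ceil(total_bp / (read_len_bp * reads_per_clone))


GENOME_BP = 2.8e6                       # genome size used in the text
p0_1x = unsequenced_fraction(1.0)       # fraction unsequenced at 1X
p0_5x = unsequenced_fraction(5.0)       # fraction unsequenced at 5X
clones_5x = clones_for_coverage(GENOME_BP, 5.0, 410)

print(f"1X: {p0_1x:.2f} unsequenced; 5X: {p0_5x:.4f} unsequenced")
print(f"clones needed for 5X: about {clones_5x}")
```

The computed values agree with the 37%, 0.67% and approximately 17,000 clone figures given above.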

Similarly, the total gap length, G, is determined by the equation $G = Le^{-m}$, and the average gap size, g, follows the
equation $g = L/N$, where N is the number of random sequence reads. Thus, 5X coverage of a polynucleotide 2.8 Mb
long leaves gaps averaging about 82 bp in size.
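
The gap estimates can be checked in the same way. The sketch below assumes that the average gap size is the sequence length divided by the number of random reads (about 34,000 reads, i.e. 17,000 clones read from both ends), which reproduces the 82 bp figure. The expected number of gaps is derived here as G/g; that derivation is an assumption of this sketch and is not stated in the text.

```python
import math


def gap_statistics(genome_bp: float, fold_coverage: float, n_reads: int):
    """Return (total gap length G, average gap size g, expected gap count).

    G = L * e^(-m) and g = L / N; the gap count G / g is a derived
    quantity added for illustration.
    """
    total_gap_bp = genome_bp * math.exp(-fold_coverage)
    avg_gap_bp = genome_bp / n_reads
    n_gaps = total_gap_bp / avg_gap_bp
    return total_gap_bp, avg_gap_bp, n_gaps


# 2.8 Mb genome at 5X coverage from 17,000 clones read at both ends.
total_gap, avg_gap, n_gaps = gap_statistics(2.8e6, 5.0, 2 * 17_000)

print(f"total gap length: {total_gap:.0f} bp")
print(f"average gap size: {avg_gap:.0f} bp")
print(f"expected number of gaps: {n_gaps:.0f}")
```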

The treatment above is essentially that of Lander and Waterman, Genomics 2: 231 (1988). 

2. Random Library Construction 

In order to approximate the random model described above during actual sequencing, a nearly ideal library of
cloned genome fragments is required. The following library construction procedure was developed to achieve this end.

Staphylococcus aureus DNA was prepared by phenol extraction. A mixture containing 600 ug DNA in 3.3 ml of
300 mM sodium acetate, 10 mM Tris-HCl, 1 mM Na-EDTA, 30% glycerol was sonicated for 1 min. at 0°C in a Branson
sonicator ([...] setting) using a 3 mm probe. The sonicated DNA was ethanol precipitated [...].

To create blunt-ends, a 100 ul aliquot of the resuspended DNA was digested with 5 units of BAL31 nuclease (New
England BioLabs) for 10 min at 30°C in 200 ul BAL31 buffer. The digested DNA was phenol-extracted,
ethanol-precipitated, redissolved in [...] TE buffer, and then size-fractionated by electrophoresis through a 1.0% low melting
temperature agarose gel. The section containing DNA fragments 1.6-2.0 kb in size was excised from the gel and the
LGT agarose was melted and the resulting solution was extracted with phenol to separate the agarose from the DNA.
DNA was ethanol precipitated and redissolved in 20 ul of TE buffer for ligation to vector.

A two-step ligation procedure was used to produce a plasmid library with 97% inserts, of which >99% were single
inserts. The first ligation mixture (50 ul) contained 2 ug of DNA fragments, 2 ug pUC18 DNA (Pharmacia) cut with SmaI
and dephosphorylated with bacterial alkaline phosphatase, and 10 units of T4 ligase (GIBCO/BRL) and was incubated
[...]. The ligation mixture then was phenol extracted and ethanol precipitated, and the precipitated DNA
was dissolved in 20 ul TE buffer and electrophoresed on a 1.0% low melting agarose gel. Discrete bands in a ladder
were visualized by ethidium bromide-staining and UV illumination and identified by size as insert (i), vector (v), v + i,
[...]. The portion of the gel containing v + i DNA was excised and the v + i DNA was recovered and resuspended
into 20 ul TE. The v + i DNA then was blunt-ended by T4 polymerase treatment for 5 min. at 37°C in a reaction mixture
containing the v + i linears, 500 uM each of the 4 dNTPs, and 9 units of T4 polymerase (New England BioLabs),
under recommended buffer conditions. After phenol extraction and ethanol precipitation the repaired v + i linears were
dissolved in 20 ul TE. The final ligation to produce circles was carried out in a 50 ul reaction containing 5 ul of v + i
linears [...] overnight. After 10 min at 70°C the reaction mixture was stored at -20°C.

This two-stage procedure resulted in a molecularly random collection of single-insert plasmid recombinants with
minimal contamination from double-insert chimeras (<1 %) or free vector (<3%). 

Since deviation from randomness can arise from propagation of the DNA in the host, E. coli host cells deficient in all
recombination and restriction functions (A. Greener, Strategies 3 (1):5 (1990)) were used to prevent rearrangements
and loss of clones by restriction. Furthermore, transformed cells were plated directly on antibiotic diffusion
plates to avoid the usual broth recovery phase which allows multiplication and selection of the most rapidly growing cells.

Plating was carried out as follows. A 100 ul aliquot of Epicurian Coli SURE II Supercompetent Cells (Stratagene
200152) was thawed on ice and transferred to a chilled Falcon 2059 tube on ice. A 1.7 ul aliquot of 1.42 M
beta-mercaptoethanol was added to the aliquot of cells to a final concentration of 25 mM. Cells were incubated on ice for
10 min. A 1 ul aliquot of the final ligation was added to the cells and incubated on ice for 30 min. The cells were heat 
pulsed for 30 sec. at 42° C and placed back on ice for 2 min. The outgrowth period in liquid culture was eliminated 
from this protocol in order to minimize the preferential growth of any given transformed cell. Instead the transformation
mixture was plated directly on a nutrient rich SOB plate containing a 5 ml bottom layer of SOB agar (1.5% SOB agar:
20 g tryptone, 5 g yeast extract, 0.5 g NaCl, 1.5% Difco Agar per liter of media). The 5 ml bottom layer is supplemented
with 0.4 ml of 50 mg/ml ampicillin per 100 ml SOB agar. The 15 ml top layer of SOB agar is supplemented with 1 ml
X-Gal (2%), 1 ml MgCl2 (1 M), and 1 ml MgSO4 per 100 ml SOB agar. The 15 ml top layer was poured just prior to plating.
Our titer was approximately 100 colonies/10 ul aliquot of transformation. 
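
The reagent concentrations quoted in the plating protocol can be sanity-checked with a simple dilution calculation. This is a sketch under the assumption that volumes are additive; the helper name is illustrative.

```python
def final_concentration(stock_conc: float, stock_vol: float, base_vol: float) -> float:
    """Concentration after adding stock_vol of stock to base_vol of diluent.

    Volumes must share one unit; the result is in the units of stock_conc.
    Assumes volumes are additive.
    """
    return stock_conc * stock_vol / (stock_vol + base_vol)


# 1.7 ul of 1.42 M (1420 mM) beta-mercaptoethanol into 100 ul of cells.
bme_mM = final_concentration(1420.0, 1.7, 100.0)

# 0.4 ml of 50 mg/ml (50,000 ug/ml) ampicillin into 100 ml of SOB agar.
amp_ug_per_ml = final_concentration(50_000.0, 0.4, 100.0)

print(f"beta-mercaptoethanol: {bme_mM:.1f} mM")
print(f"ampicillin in bottom layer: {amp_ug_per_ml:.0f} ug/ml")
```

The beta-mercaptoethanol result comes out close to the nominal 25 mM stated in the protocol.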

All colonies were picked for template preparation regardless of size. Thus, only clones lost due to "poison" DNA
or deleterious gene products would be deleted from the library, resulting in a slight increase in gap number over that
expected.

3. Random DNA Sequencing

High quality double stranded DNA plasmid templates were prepared using an alkaline lysis method developed in 
collaboration with 5Prime 3Prime Inc. (Boulder, CO). Plasmid preparation was performed in a 96-well format for all
stages of DNA preparation from bacterial growth through final DNA purification. Average template concentration was 
determined by running 25% of the samples on an agarose gel. DNA concentrations were not adjusted. 

Templates were also prepared from a Staphylococcus aureus lambda genomic library. An unamplified library was 
constructed in Lambda DASH II vector (Stratagene). Staphylococcus aureus DNA (> 100 kb) was partially digested in
a reaction mixture (200 ul) containing 50 ug DNA, 1X Sau3AI buffer, 20 units Sau3AI for 6 min. at 23°C. The digested
DNA was phenol-extracted and centrifuged over a 10-40% sucrose gradient. Fractions containing genomic DNA of
15-25 kb were recovered by precipitation. One ul of fragments was used with 1 ul of DASH II vector (Stratagene) in
the recommended ligation reaction. One ul of the ligation mixture was used per packaging reaction following the
recommended protocol with the Gigapack II XL Packaging Extract. Phage were plated directly without amplification from
the packaging mixture (after dilution with 500 ul of recommended SM buffer and chloroform treatment). Yield was about
2.5x10^9 pfu/ul.

An amplified library was prepared from the primary packaging mixture according to the manufacturer's protocol. 
The amplified library is stored frozen in 7% dimethylsulfoxide. The phage titer is approximately 1x10 s pfu/ml. 

Mini-liquid lysates (0.1 ul) are prepared from randomly selected plaques and template is prepared by long range 
PCR. Samples are PCR amplified using modified T3 and T7 primers, and Elongase Supermix (LTI). 

Sequencing reactions are carried out on plasmid templates using a combination of two workstations (BIOMEK 
1000 and Hamilton Microlab 2200) and the Perkin-Elmer 9600 thermocycler with Applied Biosystems PRISM Ready 
Reaction Dye Primer Cycle Sequencing Kits for the M13 forward (M13-21) and the M13 reverse (M13RP1) primers.
Dye terminator sequencing reactions are carried out on the lambda templates on a Perkin-Elmer 9600 Thermocycler 
using the Applied Biosystems Ready Reaction Dye Terminator Cycle Sequencing kits. Modified T7 and T3 primers are 
used to sequence the ends of the inserts from the Lambda DASH II library. Sequencing reactions are run on a combination
of ABI 373 DNA Sequencers and ABI 377 DNA sequencers. All of the dye terminator sequencing reactions are analyzed
using the 2X 9 hour module on the ABI 377. Dye primer reactions are analyzed on a combination of ABI 373 and ABI
377 DNA sequencers. The overall sequencing success rate is approximately 85% for M13-21 and M13RP1
sequences and 65% for dye-terminator reactions. The average usable read length is 485 bp for M13-21 sequences,
445 bp for M13RP1 sequences, and 375 bp for dye-terminator reactions.
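
The success rates and read lengths above can be combined into an expected yield per attempted reaction. The following sketch uses only the figures quoted in the text; the assumption that forward and reverse plasmid reactions are attempted in equal numbers, and the 14 Mb target (5X coverage of 2.8 Mb), are illustrative choices.

```python
import math

# (success rate, average usable read length in bp), as quoted in the text.
READ_STATS = {
    "M13-21": (0.85, 485),
    "M13RP1": (0.85, 445),
    "dye-terminator": (0.65, 375),
}


def expected_bp_per_reaction(success_rate: float, usable_len_bp: float) -> float:
    """Average usable bases obtained per attempted sequencing reaction."""
    return success_rate * usable_len_bp


def reactions_needed(target_bp: float, yields_bp: list) -> int:
    """Attempted reactions needed to reach target_bp, assuming the listed
    reaction types are run in equal proportions."""
    mean_yield = sum(yields_bp) / len(yields_bp)
    return math.ceil(target_bp / mean_yield)


yields = {name: expected_bp_per_reaction(*stats) for name, stats in READ_STATS.items()}
plasmid_reactions = reactions_needed(14e6, [yields["M13-21"], yields["M13RP1"]])

for name, bp in yields.items():
    print(f"{name}: about {bp:.0f} usable bp per attempted reaction")
print(f"plasmid reactions for 14 Mb: about {plasmid_reactions}")
```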

4. Protocol for Automated Cycle Sequencing

The sequencing was carried out using Hamilton Microstation 2200, Perkin Elmer 9600 thermocyclers, ABI 373 
and ABI 377 Automated DNA Sequencers. The Hamilton combines pre-aliquoted templates and reaction mixes
consisting of deoxy- and dideoxynucleotides, the thermostable Taq DNA polymerase, fluorescently-labelled sequencing
primers, and reaction buffer. Reaction mixes and templates were combined in the wells of a 96-well thermocycling
plate and transferred to the Perkin Elmer 9600 thermocycler. Thirty consecutive cycles of linear amplification (i.e., one
primer synthesis) steps were performed including denaturation, annealing of primer and template, and extension; i.e.,
DNA synthesis. A heated lid with rubber gaskets on the thermocycling plate prevents evaporation without the need for 
an oil overlay. 

Two sequencing protocols were used: one for dye-labelled primers and a second for dye-labelled dideoxy chain 
terminators. The shotgun sequencing involves use of four dye-labelled sequencing primers, one for each of the four 
terminator nucleotides. Each dye-primer was labelled with a different fluorescent dye, permitting the four individual 
reactions to be combined into one lane of the 373 or 377 DNA Sequencer for electrophoresis, detection, and 
base-calling. ABI currently supplies premixed reaction mixes in bulk packages containing all the necessary non-template 
reagents for sequencing. Sequencing can be done with both plasmid and PCR-generated templates with both 
dye-primers and dye-terminators with approximately equal fidelity, although plasmid templates generally give longer usable 
sequences. 

Thirty-two reactions were loaded per ABI 373 Sequencer each day and 96 samples can be loaded on an ABI 377 
per day. Electrophoresis was run overnight (ABI 373) or for 2 1/2 hours (ABI 377) following the manufacturer's protocols. 
Following electrophoresis and fluorescence detection, the ABI 373 or ABI 377 performs automatic lane tracking and 
base-calling. The lane-tracking was confirmed visually. Each sequence electropherogram (or fluorescence lane trace) 
was inspected visually and assessed for quality. Trailing sequences of low quality were removed and the sequence 
itself was loaded via software to a Sybase database (archived daily to 8mm tape). Leading vector polylinker sequence 
was removed automatically by a software program. Average edited lengths of sequences from the standard ABI 373 
or ABI 377 were around 400 bp and depend mostly on the quality of the template used for the sequencing reaction. 

INFORMATICS 

1. Data Management 

A number of information management systems for a large-scale sequencing lab have been developed. (For review 
see, for instance, Kerlavage et al., Proceedings of the Twenty-Sixth Annual Hawaii International Conference on System 
Sciences, IEEE Computer Society Press, Washington, D.C., 585 (1993)). The system used to collect and assemble 
the sequence data was developed using the Sybase relational database management system and was designed to 
automate data flow wherever possible and to reduce user error. The database stores and correlates all information 
collected during the entire operation from template preparation to final analysis of the genome. Because the raw output 
of the ABI 373 Sequencers was based on a Macintosh platform and the data management system chosen was based 
on a Unix platform, it was necessary to design and implement a variety of multi-user, client-server applications which 
allow the raw data as well as analysis results to flow seamlessly into the database with a minimum of user effort. 

2. Assembly 



An assembly engine (TIGR Assembler) developed for the rapid and accurate assembly of thousands of sequence 
fragments was employed to generate contigs. The TIGR Assembler simultaneously clusters and assembles fragments 
of the genome. In order to obtain the speed necessary to assemble more than 10^4 fragments, the algorithm builds a 
hash table of 12 bp oligonucleotide subsequences to generate a list of potential sequence fragment overlaps. The 
number of potential overlaps for each fragment determines which fragments are likely to fall into repetitive elements. 
Beginning with a single seed sequence fragment, TIGR Assembler extends the current contig by attempting to add 
the best matching fragment based on oligonucleotide content. The contig and candidate fragment are aligned using a 
modified version of the Smith-Waterman algorithm which provides for optimal gapped alignments (Waterman, M. S., 
Methods in Enzymology 164:765 (1988)). The contig is extended by the fragment only if strict criteria for the quality 
of the match are met. The match criteria include the minimum length of overlap, the maximum length of an unmatched 
end, and the minimum percentage match. These criteria are automatically lowered by the algorithm in regions of minimal 
coverage and raised in regions with a possible repetitive element. Fragments representing the boundaries of 
repetitive elements and potentially chimeric fragments are often rejected based on partial mismatches at the ends of 
alignments and excluded from the current contig. TIGR Assembler is designed to take advantage of clone size information 
coupled with sequencing from both ends of each template. It enforces the constraint that sequence fragments from 
two ends of the same template point toward one another in the contig and are located within a certain range of base 
pairs (definable for each clone based on the known clone size range for a given library). 
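
The overlap-detection step described above can be illustrated with a minimal sketch. This is not the TIGR Assembler code: the function names (build_kmer_index, candidate_overlaps, overlap_counts) and the toy fragments are hypothetical, and only the idea of a hash table keyed on 12 bp oligonucleotides, used to list potential overlaps and to count them per fragment, is taken from the text. Reverse-complement matches, alignment and the match criteria are deliberately left out.

```python
from collections import defaultdict
from itertools import combinations

K = 12  # oligonucleotide length named in the text


def build_kmer_index(fragments, k=K):
    """Map each k-mer to the set of fragment ids that contain it."""
    index = defaultdict(set)
    for frag_id, seq in fragments.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(frag_id)
    return index


def candidate_overlaps(fragments, k=K):
    """Count distinct shared k-mers for every fragment pair sharing at least one."""
    shared = defaultdict(int)
    for ids in build_kmer_index(fragments, k).values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1
    return dict(shared)


def overlap_counts(fragments, k=K):
    """Number of potential overlap partners per fragment.

    An unusually high count suggests the fragment lies in a repeat.
    """
    partners = defaultdict(set)
    for a, b in candidate_overlaps(fragments, k):
        partners[a].add(b)
        partners[b].add(a)
    return {frag_id: len(partners[frag_id]) for frag_id in fragments}


if __name__ == "__main__":
    # Hypothetical toy fragments: the last 14 bases of f1 equal the
    # first 14 bases of f2; f3 shares nothing with either.
    fragments = {
        "f1": "ACGTTGCAAGGCTTAACCGGATCGATTACG",
        "f2": "CCGGATCGATTACGGGTTTAAACCCGTGCA",
        "f3": "TTTTTTTTTTTTTTTTTTTTTTTTTTTTTT",
    }
    print(candidate_overlaps(fragments))
    print(overlap_counts(fragments))
```

Indexing every fragment once and then reading candidate pairs out of the table avoids comparing all fragments against each other directly, which is the speed argument made in the text; the candidate pairs would then be passed to the gapped alignment step.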



3. Identifying Genes 



The predicted coding regions of the Staphylococcus aureus genome were initially defined with the program zorf, 
which finds ORFs of a minimum length. The predicted coding region sequences were used in searches against a 
database of all Staphylococcus aureus nucleotide sequences from GenBank (release 92.0), using the BLASTN search 
method to identify overlaps of 50 or more nucleotides with at least 95% identity. Those ORFs with nucleotide sequence 
matches are shown in Table 1. The ORFs without such matches were translated to protein sequences and 
compared to a non-redundant database of known proteins generated by combining the Swiss-Prot, PIR and GenPept 
databases. ORFs of at least 80 amino acids that matched a database protein with BLASTP probability less than or 
equal to 0.01 are shown in Table 2. The table also lists assigned functions based on the closest match in the databases. 
ORFs of at least 120 amino acids that did not match protein or nucleotide sequences in the databases at these levels 
are shown in Table 3. 
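
The first step, finding ORFs of a minimum length, can be sketched as follows. The program zorf itself is not described in the text, so this is only an illustration under stated assumptions: an ORF is taken to run from an ATG start codon to the next in-frame stop codon, all six reading frames are scanned, and the names find_orfs and reverse_complement are hypothetical. Alternative bacterial start codons and the subsequent BLASTN and BLASTP filtering are not modelled.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=80, start_codon="ATG"):
    """Return (strand, frame, start, end) for each ORF of at least min_codons.

    Coordinates are 0-based and end-exclusive on the scanned strand.
    The span includes the stop codon; the codon count does not.
    """
    orfs = []
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == start_codon:
                    start = i  # first start codon opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3))
                    start = None  # look for the next ORF in this frame
    return orfs


if __name__ == "__main__":
    # Hypothetical toy sequence: ATG, five GCT codons, then a TAA stop.
    toy = "CC" + "ATG" + "GCT" * 5 + "TAA" + "GG"
    print(find_orfs(toy, min_codons=5))
```

The default of 80 codons mirrors the 80 amino acid threshold used for Table 2; passing min_codons=120 would correspond to the threshold used for Table 3.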

ILLUSTRATIVE APPLICATIONS 


1. Production of an Antibody to a Staphylococcus aureus Protein 

Substantially pure protein or polypeptide is isolated from the transfected or transformed cells using any one of the 
methods known in the art. The protein can also be produced in a recombinant prokaryotic expression system, such as 
E. coli, or can be chemically synthesized. Concentration of protein in the final preparation is adjusted, for example, by 
concentration on an Amicon filter device, to the level of a few micrograms/ml. Monoclonal or polyclonal antibody to the 
protein can then be prepared as follows. 






2. Monoclonal Antibody Production by Hybridoma Fusion 



Monoclonal antibody to epitopes of any of the peptides identified and isolated as described can be prepared from 
murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495 (1975) or 
modifications of the methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein 
over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen isolated. 
The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells 
destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused 
cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. 
Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay 
procedures, such as ELISA, as originally described by Engvall, E., Meth. Enzymol. 70:419 (1980), and modified 
methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. 
Detailed procedures for monoclonal antibody production are described in Davis, L. et al., Basic Methods in Molecular 
Biology, Elsevier, New York, Section 21-2 (1989). 



3. Polyclonal Antibody Production by Immunization 



Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by 
immunizing suitable animals with the expressed protein described above, which can be unmodified or modified to enhance 
immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and 
the host species. For example, small molecules tend to be less immunogenic than others and may require the use of 
carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and 
excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple 
intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis, 
J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971). 

Booster injections can be given at regular intervals, and antiserum harvested when antibody titer thereof, as 
determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the 
antigen, begins to fall. See, for example, Ouchterlony, O. et al., Chap. 19 in: Handbook of Experimental Immunology, 
Weir, D., ed., Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum 
(about 12 µM). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, 
for example, by Fisher, D., Chap. 42 in: Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer. 
Soc. For Microbiology, Washington, D.C. (1980). 

Antibody preparations prepared according to either protocol are useful in quantitative immunoassays which 
determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively 
or qualitatively to identify the presence of antigen in a biological sample. In addition, they are useful in various animal 
models of Staphylococcal disease known to those of skill in the art as a means of evaluating the protein used to make 
the antibody as a potential vaccine target or as a means of evaluating the antibody as a potential immunotherapeutic 
reagent. 

3. Preparation of PCR Primers and Amplification of DNA 

Various fragments of the Staphylococcus aureus genome, such as those of Tables 1-3 and SEQ ID NOS 1-5191, 
can be used, in accordance with the present invention, to prepare PCR primers for a variety of uses. The PCR primers 
are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence, 
it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are 
approximately the same. The PCR primers and amplified DNA of this Example find use in the Examples that follow. 

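The primer-pair guidance above (minimum length, similar G/C ratio, similar melting temperature) can be checked with a short sketch. It uses the Wallace rule, Tm = 2(A+T) + 4(G+C) degrees C, which is a common rough approximation for short oligonucleotides and is not prescribed by the text. The function names, the example primers and the 4 degree tolerance are hypothetical choices made for illustration.

```python
def gc_fraction(primer):
    """Fraction of bases in the primer that are G or C."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer):
    """Approximate melting temperature in degrees C (Wallace rule)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def primers_compatible(fwd, rev, min_len=15, max_tm_diff=4):
    """True if both primers meet the minimum length and have similar Tm.

    min_len follows the 15-base minimum in the text; max_tm_diff is an
    assumed tolerance, not a value from the text.
    """
    long_enough = len(fwd) >= min_len and len(rev) >= min_len
    return long_enough and abs(wallace_tm(fwd) - wallace_tm(rev)) <= max_tm_diff


if __name__ == "__main__":
    # Hypothetical 18-base primers, not taken from the sequence listing.
    fwd = "ATGCATGCATGCATGCAT"
    rev = "TACGTACGTACGTACGTA"
    print(gc_fraction(fwd), wallace_tm(fwd), wallace_tm(rev))
    print(primers_compatible(fwd, rev))
```

Because the Wallace rule depends only on base composition, two primers of equal length and equal G/C ratio always receive the same estimate, which is the reasoning behind matching G/C ratios within a pair.
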
4. Gene Expression from DNA Sequences Corresponding to ORFs 

A fragment of the Staphylococcus aureus genome provided in Tables 1-3 is introduced into an expression vector 
using conventional technology. Techniques to transfer cloned sequences into expression vectors that direct protein 
translation in mammalian, yeast, insect or bacterial expression systems are well known in the art. Vectors and 
expression systems are commercially available from a variety of suppliers including Stratagene (La Jolla, California), 
Promega (Madison, Wisconsin), and Invitrogen (San Diego, California). If desired, to enhance expression and facilitate 
proper protein folding, the codon context and codon pairing of the sequence may be optimized for the particular 
expression organism, as explained by Hatfield et al., U. S. Patent No. 5,082,767, incorporated herein by this reference. 

The following is provided as one exemplary method to generate polypeptide(s) from cloned ORFs of the 
Staphylococcus aureus genome fragment. Bacterial ORFs generally lack a poly A addition signal. The addition signal sequence 
can be added to the construct by, for example, splicing out the poly A addition sequence from pSG5 (Stratagene) using 
BglI and SalI restriction endonuclease enzymes and incorporating it into the mammalian expression vector pXT1 
(Stratagene) for use in eukaryotic expression systems. pXT1 contains the LTRs and a portion of the gag gene of Moloney 
Murine Leukemia Virus. The positions of the LTRs in the construct allow efficient stable transfection. The vector includes 
the Herpes Simplex thymidine kinase promoter and the selectable neomycin gene. The Staphylococcus aureus DNA 
is obtained by PCR from the bacterial vector using oligonucleotide primers complementary to the Staphylococcus 
aureus DNA and containing restriction endonuclease sequences for PstI incorporated into the 5' primer and BglII at 
the 5' end of the corresponding Staphylococcus aureus DNA 3' primer, taking care to ensure that the Staphylococcus 
aureus DNA is positioned such that it is followed by the poly A addition sequence. The purified fragment obtained from 
the resulting PCR reaction is digested with PstI, blunt ended with an exonuclease, digested with BglII, purified and 
ligated to pXT1, now containing a poly A addition sequence and digested with BglII. 

The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc., Grand Island, 
New York) under conditions outlined in the product specification. Positive transfectants are selected after growing the 
transfected cells in 600 µg/ml G418 (Sigma, St. Louis, Missouri). The protein is preferably released into the supernatant. 
However, if the protein has membrane binding domains, the protein may additionally be retained within the cell or 
expression may be restricted to the cell surface. Since it may be necessary to purify and locate the transfected product, 
synthetic 15-mer peptides synthesized from the predicted Staphylococcus aureus DNA sequence are injected into 
mice to generate antibody to the polypeptide encoded by the Staphylococcus aureus DNA. 

Alternatively, if antibody production is not possible, the Staphylococcus aureus DNA sequence is additionally 
incorporated into eukaryotic expression vectors and expressed as, for example, a globin fusion. Antibody to the globin 
moiety then is used to purify the chimeric protein. Corresponding protease cleavage sites are engineered between the 
globin moiety and the polypeptide encoded by the Staphylococcus aureus DNA so that the latter may be freed from 
the former by simple protease digestion. One useful expression vector for generating globin chimerics is pSG5 
(Stratagene). This vector encodes a rabbit globin. Intron II of the rabbit globin gene facilitates splicing of the expressed 
transcript, and the polyadenylation signal incorporated into the construct increases the level of expression. These 
techniques are well known to those skilled in the art of molecular biology. Standard methods are published in methods 
texts such as Davis et al., cited elsewhere herein, and many of the methods are available from the technical assistance 
representatives from Stratagene, Life Technologies, Inc., or Promega. Polypeptides of the invention also may be 
produced using in vitro translation systems such as the in vitro Express™ Translation Kit (Stratagene). 

While the present invention has been described in some detail for purposes of clarity and understanding, one 
skilled in the art will appreciate that various changes in form and detail can be made without departing from the true 
scope of the invention. 

All patents, patent applications and publications referred to above are hereby incorporated by reference. 

Table 4 



ORF | SEQ ID NO | BLAST HOMOLOG | Antigenic Region 1 | Region 2 | Region 3 | Region 4
168_6 | 5192 | lipoprotein | 36-45 | 84-103 | 152-161 | 176-185
238_1 | 5193 | chrA | 21-39 | 48-58 | 84-95 | 232-249
51_2 | 5194 | OppB gene product (B. sub | 20-36 | 70-79 | 100-112 | 121-131
278_3 | 5195 | lipoprotein | 20-29 | 59-73 | 85-97 | 162-171
276_2 | 5196 | lipoprotein | 21-33 | 65-74 | 177-186 | 211-220
45_4 | 5197 | ProX | 28-37 | 59-69 | 85-100 | 120-129
316_8 | 5198 | hypothetical protein | 45-54 | 88-97 | 182-192 | 243-253
154_15 | 5199 | unknown | 31-40 | 48-58 | 79-88 | 95-104
228_3 | 5200 | unknown | 25-38 | 40-52 | 64-74 | 80-89
228_6 | 5201 | unknown | 29-41 | 89-101 | 128-143 | 173-184
50_1 | 5202 | unknown | 21-33 | 52-61 | 168-182 | 197-206
112_7 | 5203 | iron-binding periplasmic | 21-31 | 58-67 | 92-101 | 111-120
442_1 | 5204 | unknown | 30-39 | 91-100 | 122-137 | 182-192
66_2 | 5205 | unknown | 50-59 | 104-116 | 127-136 | 167-182
304_2 | 5206 | [illegible]-binding periplasmic | 19-28 | 48-57 | 75-84 | 103-116
44_1 | 5207 | hypothetical protein | 27-36 | 86-95 | 129-138 | 192-201
161_4 | 5208 | SphX | 27-44 | 149-161 | 166-175 | 201-210
46_5 | 5209 | cmpC (permease) | 21-33 | 61-70 | 83-92 | 100-109
942_1 | 5210 | traH [Plasmid pSK41] | 83-92 | 109-118 | 127-142 |
5_4 | 5211 | ORF (S. aureus) | 12-22 | 87-96 | 111-120 | 151-160
20_4 | 5212 | peptidoglycan hydrolase (S | 24-34 | 129-138 | 141-150 | 161-171
328_2 | 5213 | lipoprotein (H. flu) | 81-90 | 123-133 | 290-299 |
520_2 | 5214 | fibronectin binding protein | 44-54 | 63-79 | 81-90 | 95-110
771_1 | 5215 | emml gene product (S. pyc | 30-39 | 65-82 | 96-106 | 112-121
999_1 | 5216 | predicted trithorax prot. (D | 7-16 | 120-129 | 157-166 |
853_1 | 5217 | ORF2136 (Marchantia polym | 43-52 | 88-97 | 102-111 |
287_1 | 5218 | psaA homolog | 13-22 | 28-44 | 72-82 | 114-124
288_2 | 5219 | cell wall enzyme | 14-23 | 89-98 | |
596_2 | 5220 | penicillin binding protein 2b | 40-49 | 59-68 | 76-87 | 106-115
217_5 | 5221 | fibronectin/fibrinogen bindi | 28-37 | 40-49 | 62-71 | 93-111
217_6 | 5222 | fibronectin/fibrinogen bp | 10-19 | 31-40 | 54-62 | 73-92
528_3 | 5223 | myosin cross reactive prot | 4-13 | 29-47 | 60-73 | 90-99
171_11 | 5224 | EF | 20-31 | 91-110 | |
63_4 | 5225 | penicillin binding protein 2b | 12-21 | 59-68 | 95-104 |
353_2 | 5226 | | 46-55 | 62-71 | |
743_1 | 5227 | 29 kDa protein in fimA regi | 23-32 | 68-79 | 94-103 | 175-184
342_4 | 5228 | Twitching motility | 10-19 | 48-60 | 83-92 | 111-121
69_3 | 5229 | arabinogalactan protein | 97-106 | 132-141 | 158-167 | 180-189
70_6 | 5230 | nodulin | 36-45 | 48-57 | 137-160 | 179-188
129_2 | 5231 | glycerol diester phosphodie | 8-17 | 41-50 | 55-74 | 97-106
58_5 | 5232 | PBP (S. aureus) | 26-35 | 70-79 | 117-126 | 152-161
188_3 | 5233 | MHC class II analog (S. aure | 72-81 | 94-103 | 115-124 | 136-145
236_6 | 5234 | histidine kinase domain (Die | 24-33 | 52-67 | 81-94 | 106-121
310_8 | 5235 | clumping factor (S. aureus) | 59-71 | 77-86 | 93-102 | 118-127
601_1 | 5236 | novel antigen/ORF2 (S. au | 45-54 | 91-104 | 108-117 | 186-195
544_3 | 5237 | ORF YJR1 Sic (S. cerevisae | 76-90 | 101-111 | 131-140 | 154-164
662_1 | 5238 | MHC class II analog (S. aure | 22-32 | 71-80 | 89-98 | 114-122
87_7 | 5239 | 5' nucleotidase precursor ( | 29-45 | 62-71 | 105-114 | 125-137
120_1 | 5240 | B65G gene product (B. sub | 102-11[illegible] | | |
Table 4 



ORF 




Antigenic 1 Regions 


(cont) 








Region 5 


Region 6 


Region 7 


Region 8 ■ 


Region 9 


Region 1 0 


168.6 


244-272 


303-315 ! 










238.1 


260-269 


291-301 j 


308-317 i 








51.2 i 


140-152 


188-208 ! 


211-220 ! 


256-266 : 


273-283 




278.3 1 


198-209 


i 


i 


i 






276.2 


255-268 


t 


1 


I 






45.4 


177-199 


221-230 1 


234-243 ! 


268-279 ! 


284-293 


304-313 


316.8 




154.15 


148-157 


177-187 


202-211 i ! 




228.3 


101-119 


139-154 


166-181 








228.6 ! 










50.1 1 










112.7 


136-149 


197-21 1 


218-229 


253-273 * 


... 1 




442.1 


199-210 


247-257 


264-277 


287-309 i 




66.2 












304.2 


178-187 


250-259 










44.1 


• 










161.4 i 








46.5 


131-141 


162-176 


206-215 


243-252 


264-273 


r 285-294 


942.1 




e 


1 
i 


5.4 


1 89-205 


230-239 


246-264 


301-318 


340-354 


1 378-387 


20.4 


202-212 


217-234 


260-275 


314-336 


366-373 


380-391 


328.2 ! 






1 
t 


520.2 1 




I j 


771.1 


i 145-154 




I 


I j 


999.1 ! ! 




853.1 ! I ! ! i 


287.1 


| 1 54-1 64 




i 




288.2 ! 1 


I ; 


596.2 


121-130 








217.5 


244-253 


259-268 


288-297 


302-311 1 


217.6 


' 144-158 


174-183 


188-197 


207-216 


! 226-242 




528.3 i 




171_11 ! 


i ! 


63.4'^ . i t 


r 
1 


353_2 ! i 


! 


74J.1 


: 197-207 




j 


i 


t 


1 


342_4- ; 


1 : i 


69.3 


195-211 




i 

i 


70.6 


206-215 


263-272 


; 291-301 


■ 331-340 


358-371 


390-414 


129.2 


117-127 


141-157 


168-183 


202-21 1 


222-231 


261-270 


58.5 


1 84-203 


260-269 


275-299 


330-344 


372-381 


424-433 


188_3 ! 


236.6 


138-147 


163-172 


187-198 


244-261 


268-278 


308-317 


310.8 


131-140 


144-153 


177-186 


! 190-199 


204-21 3 


'216-227 


601.1 


208-218 












544.3 


170-179 


184-193 


! 224-235 


274-287 


327-336 


352-361 


662 1 


87 7 ■ 1 


120.1 i : 
Table 4 



ORF 


Antigenic! Regions 


(cont) 






Region 1 1 Region 1 2 


Region 13 


Region 1 4 Region 1 5 


Region 16 


168.6 


• 


• 




238_1 


I I ! 


51_2 










278_3 








— .. 


276_2 








. 


45.4 






- 




315.8 






; 


. 


154.15 i ; 




j 




228.3 ! 




i 




228_6 


1 




i 




50.1 


1 




| 




112_7 


t 






— 


442.1 


i 


r - 


i 




66_2 I : 




i 




304.2 I 








44_1 




i 


. 




161_4 










46_5 


306-315 








942_1 I ; 




1 




5_4 


393-407 ! 416-426 


456-465 


1 




20_4 


396-405 : 410-419 


461-481 






328_2 


i 




1 

i 




520.2 


i 




i 




771.1 


1 








999.1 






1 J 




853.1 : 








287.1 ! ! 


i 


288.2 ! 






596.2 i 


I 




217.5 ' ! 






217_6 i i 






528.3 i 








171.11 i 








63.4 I 








353.2 ! 








743-.1 ! 


1 


342.4 I 


1 




69.3 ! 


i 




70.6 


1 453-471 506-515 








129.2 


t 296-315 


i 

1 


I 




58.S 


188.3 ' i ! i 


236.6 


358-377 410-423 


i 428-439 


1442-457 467-476 


480-493 


310.8 


. 238-251 256-275 


I 281-290 


1296-310 314-333 


3 , 38-347 


60i_i i ; ! ! 


544.3 i i 




662.1 ! 




87.7 j i 




120.1 
ORF 


Antigenic] 


Regions (cont) 








Region 1 7 Region 1 8 i 


Region 1 9 Region 20 ; 


Region 21 


_??9i9 n . 22_ 


1 68.6 i 




; ; 


238.1 i : _ ... ' 


51.2 \ 


278_3 i ; , ; 


276.2 ; i ! 1 


45.4 






i 




3 i 6.8 










154.15 




i 1 


228.3 


1 

' ■ ■ 


i 


228.6 ; 




i 


50.1 










112.7 ! 


t 






442.1 




! 






66_2 ■ ! 


i 






304.2 1 








44_1 


1 






161.4 






46.5 


• i 




t 


942.1 




i ! 






5.4 


i 


! • i 




" 20 4 i 1 ! ! 


328 2 i ! 1 


j 


S20 2 1 ; 






771.1 


i 


i 






999 1 ' 1 




i 


853 1 ! 1 




287.1 i 


i 


288.2 ' 


, — r 


—5?6-? i : : U 




217 5 i 




217.6 




T 

i _. 


I 




528.3 


t 


1 


i 




171 11 1 * i 


63 4 s i 1 ' 


353 2 " j i 


743-1 " ' 5 i 


342.4 


i 




i 


j 


69.3 






{ 


i 


70 6 ! ! 


129 2 ' 


58 5 


188 3 




310.8 


357-366 370-379 


^429-438 443-452 


478-487 


551-560 


601.1 










544.3- 










662 1 


87.7 










120.1 


> 
Table 4 



ORF 


Antigenic 


Regions ! (cont) 


i 






: Region 23 Region 24 ! 


Region 25 i 


Region 26 


Region 27 ! 


Region 28 


168.6 












238.1 


: i 


i 




j 




51.2 




i 








278.3 i 1 




1 




276.2 ! ! 




i 




45_4 


i i 


i 




i 




3 1 6.8 












154.15 




_ 


^ ; I 


228-3 ! 


- • i 




228_6 


' 1 








. 50.1 


1 




i 




1 1 2.7 I 








442.1 




i 




66.2 


■ 1 

1 1 




I 




304.2 ! { 


I 




44_l i i 




■ j 


161.4 


1 










46.5 


; i 










942.1 i 1 ! ! 




5_<- ! 1 






20.4 ! ! 


1 




328.2 ! 1 


1 




520.2 


1 




1 




771.1 i i 






999.1 ! ! 1 




853.1 ! I 1 




287 1 ' i 




288.2 






! 






596.2 






i 






217.5 


t 










217 6 1 




528 3 ! ' 




171 11 ! i 




63 4 5 i 1 




353.2 


I 




i 


i 




743.1 












342_4 










i 


69 3 1 


70.6 










1 


129.2 












58.5 












188.3 












236.6 








i 


310.8 


622-632 670-685 


,708-718 


823-836 


858-867 


877-886 


601.1 


: i i 


544.3 


• 


662.1 




87.7 




120 1 
Table 4 




ORF 



Ant ige nic Regions ( cont) 



_ Region 29 Region 30 



168. 6 
238. 1 
51_2 



278.3 
276_2 



45__4 
316_8 



! 

-■i 



154-15 



2Z8_ 3 
228- 6 
J0_j__ 
112_7 



442_1 



66_2 



304-2 



_44_J__ 
161-4 
46_5 



942 _l 



5_4 



20_4 



328_2 



520_2 



771-1 



999_1 



853_1 



287_1 



288-2 



596_2 



217-5 
217_6 



528.3 



171.11 



63_4 



353-2 



743L1 
342-4 



6913 



70_6 



129, 2 
58_5 



188_3 



236_6 



310_8 



601_1 
"S44_3 



662,1 
120_1 
Table 4 



ORF 




PI ACT 


Antigenic! Regions 


* 








nuMunju 


Region 1 


Region 2 


ttegion <3 


Region ■* 


46.1 


:5241 


aiacnyuc uenyarogenase 


8-1 7 


36-52 


"83-96 


1 1 5.151 
l_ I I C I 


63_4 


5242 


glycerol esier nyuroiabc_^r._ 


9:26 


57-73 


93-107 


1 tJ'l 3 3 


174_6 




5243 ketopantoate hydroxymeth 


71-80 


" 203-2 1 2 


242-254 ! 


_ ?65_-274 


206.16.5244 


ornithine acetyltransferase 


1-10 


34-43 


54-63 " " 


1 OA 510 


267.1 


15245 


NaH-antiporter protein (E. r 


i ?n.i ?q 


332-347 


398-408 ; 




322.1 


15246 


ar-ri f 1 a v/in rpcictanrp orot^in 


58-75 


153-164 


203-231 i 


2 64-28 4* ~~ 


41 5.2 


5247 


transport ATP-binding protr 


108-126 


218-227 


298-308 ; 


315-334 


214_3 


1 524*8" 


2-nitropropane dioxygenas* 


123-136 


216-233 


283-292 ! 




587.3 


15249 


clumping factor 


5-14 


43-54 


59-68 j 




685.1 


; 5250 


signal peptidase 


59-68 


72-81 1 86-95 ! 


99-108 


54.3 


i5251 


fibronectin binding protein 1 


23-32 


37-46 ! 


50-59 ; 


89-98" 


54.4 
54_5 


"'5252" 
[5253 


fibronectin binding protein 1 
fibronectin binding protein 1 


43-52 
49-60 


66-75" 
81-90 


__95-J04 ; 

i 


J47-15V 


54.6 


'5254 


fibronectin binding protein 1 


55-71 


82-97 


139-158 j 

96-105 i 




328_1 


5255 


lipoprotein (H. flu) 


11-20 


61-70 





Table 4 




ORF 




Antigenic: Regions ! 


(cont) f 


! 






Region 5 


Region 6 i 


Region 7 | 


Region 8 ! 


Region 9 


Region 10 


46_1 


215-242 • 


333-352 ! 


376-385 ! 


416-432 i 


471-487 


63_4 


145-154 ; 


191-202 ■ 


212-223 1 


245-265 ! 


274-283 


291-300 


174_6 










1 




206.16 


239-259 j 


275-284 J 


i 


1 






267_1 


i 


i 










322_1 


298-319 i 


350-359 


l 


1 






41S_2 


344-353 


3 71-380 : 


395-404 ; 


456-465 i 


486-495 I 




214_3 


318-337 


"365-375 ' 






1 




53>_3 


: 106-115 


142-151 | 


156-166 


173-182 


186-198 


204-213 


685_1 


: 113-122 


130-145 ! 








54_3 


1 128-138 


18S-194 ! 217-226 


251-260 


268-277 


295-305 


S4_4 


; 175-188 


191-200 


203-212 


220-229 






S4_5 


1 












54_6 


i 220-230 


287-304 


317-326 


344-353 


364-373 


378-387 


328_1 ! 













Table 4 



ORF 




* Antigenic Regions 


i (cont) 








Region 1 1 


Region 1 2 [ 


Region 13 


i 

"i 


Region 14 


i Region 1 5 


Region 17 


46.1 




i i 










63.4 


306-315 


319-328 : 


366-376 


i 395-420 


453-462 


467-476 


174.6 








t 
1 








206,16 








l 
| 






> 


267_1 








i 




i 


I 


322.1 

415_2 


~ 539-555~ 






I 

! 




i 
1 




~~214 3 ; ' i 1 i . 1 


587.3 


217-226 


278-287 


318-327 


1332-342 


1351-360 I377-386 


685.1 
54l| 


r"316^32S~ 


1 

I" J2V345 _ 


355-372 " 


J 


387-396 


416-425 


1 

^38^448 i " 


54.4 














54__5 i J 










54.6 


! 396-407 


. 427-436 


514-531 




541-550 


{569*578 


!61 2-622 


328_1 






1 




1 1 
ORF 




Antigenic 


Regions 


(cont) 








; Region 18 


Region 1 9 


Region 20 


1 


Region 21 


Region 22 


Region 23 


46.1 








\ 








63.4 


.485-500 


513-525 




' 1 








174.6 














206.16 








— 








267.1 
















322.1 


i 






_ 








415.2 
















214.3 












1 




587.3 


1396-405 


"I*426~4"42 


,459-470 




485-494 


1 505-51 4 
SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: Human Genome Sciences, Inc. 

(B) STREET: 9410 Key West Avenue 

(C) CITY: Rockville 

(D) STATE: Maryland 

(E) COUNTRY: US 

(F) POSTAL CODE: 20850 

(ii) TITLE OF INVENTION: Staphylococcus aureus Poly- 
nucleotides and Sequences 

(iii) NUMBER OF SEQUENCES: 5255 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Diskette, 3.50 inch, 1.4 Mb storage 

(B) COMPUTER: HP Vectra 486/33 

(C) OPERATING SYSTEM: MSDOS version 6.2 

(D) SOFTWARE: ASCII Text 

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: US 60/009,861 

(B) FILING DATE: 05-JAN-1996 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5895 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

TCCATTATGA AGTCACAAGT ACTATAAGCT GCGATGTTAC CAATGTTTTT TAAAATCCCA 60 

GTAATAAAAT CAAAAAATAA GTTAAATAAT GTATTCATTT TAAGTCCTCC TTAATAAAGa 120 

75 aaataGGTAA TAATGTAATA GCTTCTATTA TGATG CCTAA TTGAATGAAT TGGGCAAATG 180 

GCTCTTTGAT GATAAGTGTG ATAATGAAAA GGGTTAAACT AACAATAATC GCATAATATT 240 

TTTTTCGTTT AATAAGTCGC ACAGGAATGG GCTTCTTTTT AGTTGCTGCA GGAGCATATA 300 

CTGAGATTAC ACCTAAAGAA ATAACTGTTA AAATAAT CAT AATTAAAAAG TTAATATGAA 3 60 

AATTTACTAT TACTAAAGGT AAAAGTATAA ATAGTATAAT ACTTTCTACA TAACACCAAA 420 

AAGAAGAAGG TGCATGTGCa CCATGTGCAT GtCTTCTTAT TAAATAAAAT GTTAAATTCG 4 80 

TAATTAACGT AAACAGAAAA ATGTTTAAAA TATAGGCAAT AGTATACATA ACAATTAATT 54 0 

TACCTATATT TTTAGCTAAG ACCTGCATCC CTAATCGTAC TTGCAAAAAT TGAATATGAT 600 

CTAAGTTATT TCTCTTTTGA AGATACGTGG CAAACTGGTC AATTTTATTA TCAAAATAAT 660 

TCAATTTTAC ACCACTCTCC TCACTGTCAT TATACGATTT AGTACAATCT TTTATCATTA 72 0 

TATTGCCTAA CTGTAGGAAA TAAATACTTA ACTGTTAAAT GTAATTTGTA TTTAATATTT 780 

TAACATAAAA AAATTTACAG TTAAGAATAA AAAACGACTA GTTAAGAAAA ATTGGAAAAT 840 

AAATGCTTTT AGCATGTTTT AATATAACTA GATCACAGAG ATGTGATGGA AAATAGTTGA 900 

TGAGTTGTTT AATTTTAAGA ATTTTTATCT TAATTAAGGA AGGAGTGATT TCAATGGCAC 960 

AAGATATCAT TTCAACAATC GGTGACTTAG TAAAATGGAT TATCGACACA GTGAACAAAT 1020 

TCACTAAAAA ATAAGATGAA TAATTAATTA CTTTCATTGT AAATTTGTTA TCTTCGTATA 1080 

GTACTAAAAG TATGAGTTAT TAAGCCATCC CAACTTAATA ACCATGTAAA ATTAGCAAGT 1140 

GAGTAACATT TGCTAGTAGA GTTAGTTTCC TTGGACTCAG TGCTATGTAT TTTTCTTAAT 1200 

TATCATTACA GATAATTATT TCTAGCATGT AAGCTATCGT AAACAACATC GATTTATCAT 1260 

TATTTGATAA ATAAAATTTT TTTCATAATT AATAACATCC CCAAAAATAG ATTGAAAAAA 1320 

TAACTGTAAA ACATTCCCTT AATAATAAGT ATGGTCGTGA GCCCCTCCCA AGCTCGCGGC 1380 

CTTTTTTGTA ATGAAGAAGG GATGAGTTAA TCATCATTAT GAGACCCGCC GTTAAAATAT 1440 

TCATTTGCAA AGGGCGAAAT GGGTTCTTAC TGAGTTATCT ATTATAAAAA AATAAACATA 1560 

GACTTATGAA AAATCTCTCA TAAATCTATG TTTAGTCATG aCATGTGTTA AATATTATTT 1620 

CGGGCGCTTC TTATTTATAC AAATCTAATT TAATACTTTT AAATACAGGT ATATTTTCgC 1680 

GTTGCTGTTC TACTTCATTT AAGTTTAAAT CTACAGTCAA AATATCTGCG GATTCATTTA 1740 

ATTCTCCAAC TAAATCTCCA TTTGGGTTTA TAACTATCGA ATGACCAGCA TATTCTGTGT 1800 

TACCATCGAA TCCAGTGCTA TTAGTTCCAA TGACAAACAT ATTATTTTCA ATTGCACGTG 1860 

CCTTTAGTAA TGAATGCCAA TGTTGAAGAC GTGACATAGG CCATTGCGCC ACATAAAATG 1920 

CAATTTTAGC ACCACTACGA GCAGGATATC TTAATAATTC TGGAAAACGT AAATCATAAC 1980 

AGATAAGTTG GGTCACATAA GTACCGTCAG ACAATTGAAA GGGTTCAGCT ACGTATTCGC 2040 

CAGCGGTTAA AAATTCATGC TCTCTTAACA TAGGAACTAA ATGAACTTTG TCGTATTCaT 2100 

TAATCAGCTG GCCACTTTTA TTCACACTAA AAGCTGTATT AAATATTTGA TTGTTTCTAA 2160 

TGTTAGAAAC TGACCCAGCT ACGATATCGA CTTTATATTT TTCAGCTAAA TGTTTAATAA 2220 

ATGAAAAACT TTGTCCTAGA TTATTATCTG CTTTTTCATT TAAATGCTCT AAATCATAGC 2280 

CATTATTCCA CATTTCAGGT AAAACGACTA CATCTACTTC AGCATTCATA TTTTTTTCGA 2340 

ACCATTGCGT TATTTGAGTT TCATTTTTAG AACTATCTCC AAAAACAATC GGTAATTGAT 2400 

AAATTTGGAC TTTCATAACA TCACATCCTT GATAGATCTT ATATATAACT TACTAAAAGT 2460 

TATGTTGAAA CGCAAAAAAC GAGCACAAGA CATAAAATCA AAGTCCTAGG CTCTACAAAG 2520 

TTATATTGAC AGTAGTTGAT GGGGCCCCAA CATAGAGAAA TTGGAACACC AATTTCTACA 2580 

GACAATGCAA GTTGGGGTGG GCTCTAACAT AAAGAAATAC TTTTTCTTTA GAAATTAGTA 2640 

TTTCTTATAC ATGAGTTTTA CTCATGTATT CCTATTCTTA AGTGCACATT AGCAGCGGCT 2700 

AATGTGTAAG AACTACTACA TAATGAATAA CTAATGATTC TTTATCATTT CTGTCCCATT 2760 

CCTAACAATA TATTGATTAT TTTTTTATTA CGAAACGATC TTCCACTGGA TTAAATGTTT 2820 

TTTCGCCAGC AGCTTCACGA ATATCACCAA ATGGCATTTG AGCAATAAGT TTCCAACTTT 2880 

TAGGAATATT AAATTCATTT GAAGTCATCT CATCAACAAG TGGATTATAG TGTTGTAATG 2940 

AAGCACCTAT GCCTTTAGTA GCTAATGCAG TCCAAATTGC AAATTGATGC ATGGCATTTG 3000 

TTTGAGTTGA CCATATTGCA AAATTATCAT AGTAGTTTGG CATTTGTTCT TGTAAACCAC 3060 

TTACAACATC TTGATCTTCA TAAAACAAAA TTGTACCGTA TGAATGTTTG AAGTTATCAA 3120 

TTTTTTGTTC AGTTGGCTCG AAATCACGAT TCTCTCCCAT GACTTCTTTT AAAATTGCTT 3180 

TTGTGTTATC CCAAAATTTA TTATTGTTGT CATTTAACAA GAGAACAATT CTAGTTGATT 3240 

CATCGCTAAT TGATATCGAA TCTTTCAAAT TATATATTGA ACGTCTTTCT TCCATTGCAT 3360 

TGTCAAAAGT CATTGCTTTT TTATCTTTTT TAAATAAGCC CATAATTATT GCTCCTTCTT 3420 

TAGTAAAGAA TACTTAATAG ACTAAGTATA AAATTTATAC TCGTACTTGT AAAGCAATAT 3480 

TTACGAAAAT TTCAAGAATA TTAATATTCA TTTTCAAATT CCAAATATAA ATGCATTTTC 3540 

AACGCATATT TATTATACTT AGATTAATAC TTACATGAAA AAGGGAGGTG TCTCGTGAAA 3600 

TGTCATATCA TTGGTTTAAG AAAATGTTAC TTTCAACAAG TATTTTAATT TTAAGTAGTA 3660 

GTAGTTTAGG GCTTGCAACG CACACAGTTG AAGCAAAGGA TAACTTAAAT GGAGAAAAAC 3720 

CAACTACTAA TTTGAATCAT AATATAACTT CACCATCAGT AAATAGTGAA ATGAATAATA 3780 

ATGAGACTGG GACACCTCAC GAATCAAATC AAACGGGTAA TGAAGGAACA GGTTCGAATA 3840 

GTCGTGATGC TAATCCTGAT TCGAATAATG TGAAGCCAGA CTCAAACAAC CAAAACCCAA 3900 

GTACAGATTC AAAACCAGAC CCAAATAACC AAAACTCAAG TCCGAATCCT AAACCAGATC 3960 

CAGATAACCC GAAACCAAAA CCGGATCCAA AACCAGACCC AGATAAACCA AAGCCAAATC 4020 

CGGATCCAAA ACCAGATCCA GATAACCCGA AACCAAATCC AGATCCAAAA CCAGACCCAG 4080 

ATAAACCAAA GCCAAATCCG GATCCAAAAC CAGATCCAGA TAAACCAAAG CCAAATCCGA 4140 

ATCCAAAACC AGACCCTAAT AAGCCAAATC CTAACCCGTC ACCAGATCCC GATCAACCTG 4200 

GGGATTCCAA TCATTCTGGT GGCTCGAAAA ATGGGGGGAC ATGGAACCCA AATGCTTCAG 4260 

ATGGATCTAA TCAAGGTCAA TGGCAACCAA ATGGGAATCA AGGAAACTCA CAAAATCCTA 4320 

CTGGTAATGA TTTTGTATCC CAACGATTTT TAGCCTTGGC AAATGGGGCT TACAAGTATA 4380 

ATCCGTATAT TTTAAATCAA ATTAATAAGT TGGGCAAAGA TTATGGAGAA GTTACTGATG 4440 

AAGACATTTA TAATATTATT CGAAAACAAa ATTTCAGCGG AAATGCATAT TTAAATGGAT 4500 

TACAACAGCA ATCGAATTAC TTTAGATTCC aATATTTCAA TCCATTGAAA TCAGAAAGGT 4560 

ACTATCGTAA TTTAGATGAA CAAGTACTCG CATTAATTAC TGGTGAAATT GGATCAATGC 4620 

CAGATTTGAA AAAGCCCGAA GATAAGCCGG ATTCAAAACA ACGCTCATTT GAACCGCATG 4680 

AAAAAGACGA TTTTACAGTA GTTAAAAAAC AAGAAGATAA TAAGAAAAGT GCGTCAACTG 4740 

CATATAGTAA AAGTTGGCTA GCAATTGTAT GTTCTATGAT GGTGGTATTT TCAATCATGC 4800 

TATTCTTATT TGTAAAGCGA AATAAAAAGA AAAATAAAAA CGAATCACAG CGACGATAAT 4860 

CCGTGTGTGA TTCGTTTTTT TTATTATGGA ATAAAAATGT GATATATAAA ATTCGCTTGT 4920 

TCCGTGGCTT TTTTCAAAGC CTCAGGATTA AGTAATTGGA ATATAACGAC AAATCCGTTT 4980 

TGTAACATAT GGATAATAAT TGGAACAGCA AGCCGTTTTG TCCAAACATA TGCTAATGAA 5040 

AATATTAATG AACTTACTGT TGTAGCAATA ATAAATGCCA CGATACGATT ACCTTTAATC 5160 

GCATTAAATA ATTCTCCAAA GATTACTTTT CTGAATACAT ATTCTTCTAA TAAAGGACCA 5220 

ATAATAGATA CAAAGAAGAT AAATATAGGT ATTTTTCGAG CAATAATAAT TAGCTTTTCT 5280 

GTATTAGGAC TTACTTGTTG TCCACCATAA ATTTGCGTTA ATACAATGCT CACTACCATT 5340 

TGATAAATCA TTACCAATGC AAATCCAAGC AATGCCCATG GAATGATATA TTTTTTAGGT 5400 

TCTTTAACTT CTAATTCTAA TTTTGTTGGA TTTTTAATTT TTAAATTAAT TAAAATAATC 5460 

GTCGTGGCGG CGATTAAAAA TAGAACAAGT TGTATGTAAA TGACTGCTTT AGTCAGTTCT 5520 

ATGCCACTAT ATTGTACAAA TGGTAATTTT TTTACAATGA GAAGCGGTAA AAATTGAGAC 5580 

AATATATAAA TAATAACAGT TAGCAATGAT GCCCATAATC tTGTCATAAT TTTCCTCCAA 5640 

ATATTTGTTT ATAATTTATT TTATCGTAAA TAACTTGAAG TTACAAAACT TAATTAAAAG 5700 

GTTATGACTT GAAATTTTGA CCAAATTTGA TTATTATAAA TGTATGTTAG CACTCTTTAA 5760 

TGTTAAGTGC TAAACTTTAG GTTTTTTAAG GAGGAACAAT CATGCTAAAA CCAATTGGAA 5820 

ATCGTGTGAT TATTGAGAAA AAAGAACAAG AACAAACAAC TAAAAGTGGn ATTGTTTAAC 5880 

TGATAGTGCT AAAGA 5895 

(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6796 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

TTTGAAAAAA CAAGGTACGA TTGGTTTAAT AACATATATG AGAACCGATT CTACACGTAT 60 

TTCaGATACT GCCAAAGTTG AAGCAAAACA GTATATAACT GATAAATACG GTGAATCTTA 120 

CACTTCTAAA CGTAAAGCAT CAGGGAAACA AGGTGACCaA GATGCCCATG AGGCTATTAG 180 

ACCTTCAAGT ACTATGCGTA CGCCAGATGA TATGAAGTCA TTTTTGACGA AAGACCAATA 240 

CCGATTATAC AAATTAATTT GGGAACGATT TGTTGCTAGT CAAATGGCTC CAGCAATACT 300 

TGATACAGTC TCATTAGACA TAACACAAGG TGACATTAAA TTTAGAGCGA ATGGTCAAAC 360 

AATCAAGTTT AAAGGATTTA TGACACTTTA TGTAGAAACT AAAGATGATA GTGATAGCGA 420 

AAAGGAAAAT AAACTGCCTA AATTAGAGCA AGGTGATAAA GTCACAGCAA CTCAAATTGA 480 

ACCAGCTCAA CACTATACAC AACCACCTCC AAGATATACT GAGGCGAGAT TAGTAAAAAC 540 

AAAGCGTAAC TATGTCAAAT TAGAAAGTAA GCGTTTTGTT CCTACTGAGT TGGGAGAAAT 660 

AGTTCATGAA CAAGTGAAAG AATACTTCCC AGAGATTATT GATGTGGAAT TCACAGTGAA 720 

TATGGAAACG TTACTTGATA AGATTGCAGA AGGCGACATT ACATGGAGGA AAGTAATCGA 780 

CGGTTTCTTT AGTAGCTTTA AACAAGATGT TGAACGTGCT GAAGAAGAGA TGGAAAAGAT 840 

TGAAATCAAA GATGAGCCAG CCGGTGAAGA CTGTGAAATT TGTGGTTCTC CTATGGTTAT 900 

AAAAATGGGA CGCTATGGTA AGTTCATGGC TTGCTCAAAC TTCCCGGATT GTCGTAATAC 960 

AAAAGCGATA GTTAAGTCTA TTGGTGTTAA ATGTCCAAAA TGTAATGaTG GTGACGTCGT 1020 

AGAAAGAAAA TCTAAAAAGA ATCGTGTCTT TTATGGATGT TCGAAATATC CTGAATGCGA 1080 

CTTTATCTCT TGGGATAAGC CGATTGGAAG AGATTGTCCA AAATGTAACC AATATCTTGT 1140 

TGAAAATAAA AAAGGCAAGA CAACACAAGT AATATGTTCA AATTGCGATT ATAAAGAGGC 1200 

AGCGCAGAAA TAATATTTTT ATTTCCTAGA TACATTTTAA GATTGTTAAA TAGAATCATT 1260 

AGTGAATCTT ATTTTAAAGA TAGTAAAGGA TTAATCTAAA TAAGTGCGGA TAATATAAAC 1320 

ATAACAACAT AATTAAmAGA CATAAATGAC aATAAAAGGA GTATAGAAAT GACTCAAACT 1380 

GTAAATGTAA TAGGTGCTGG TCTTGCCGGT TCAGAAGCGG CATATCAATT AGCTGAAAGA 1440 

GGAATTAAAG TTAATCTAAT AGAGATGAGA CCTGTTAAAC AAACACCAGC GCACCATACT 1500 

GATAAATTTG CGGAACTTGT ATGTTCCAAT TCATTACGCG GAAATGCTTT AACTAATGGT 1560 

GTGGGTGTTT TAAAAGAAGA AATGAGAAGA TTGAATTCTA TAATTATTGA AGCGGCTGAT 1620 

AAGGCACGAG TTCCAGCTGG TGGTGCATTA GCAGTTGATA GACACGATTT TTCAGGTTAT 1680 

ATTACTGAAA CACTTAAAAA TCATGAAAAT ATCACAGTTA TTAATGAAGA AATTAATGCC 1740 

ATTCCAGATG GATACACAAT TATCGCAACA GGACCACTTA CTACAGAAAC CCTTGCGCAA 1800 

GAAATAGTGG ACATTACTGG TAAAGATCAA CTTTATTTCT ATGATGCGGC TGCTCCAATT 1860 

ATTGAAAAAG AATCTATTGA TATGGATAAA GTTTACTTAA AGTCCCGTTA TGATAAAGGT 1920 

GAAGCTGCAT ATTTAAACTG TCCTATGACT GAGGATGAAT TTAATCGCTT TTATGATGCA 1980 

GTATTAGAAG CTGAAGTTGC GCCTGTAAAT TCATTTGAAA AAGAAAAATA TTTCGAGGGT 2040 

TGTATGCCTT TTGAAGTAAT GGCAGAACGC GGACGCAAGA CATTACTATT TGGACCAATG 2100 

AAACCAGTAG GATTAGAAGA TCCAAAGACT GGGAAACGTC CTTATGCGGT GGTTCAATTA 2160 

AGACAAGATG ACGCTGCTGG TACACTCTAC AATATTGTTG GCTTCCAAAC GCATTTAAAA 2220 

TGGGGAGCTC AAAAAGAAGT CATTAAATTA ATTCCAGGCT TAGAAAATGT TGATATTGTT 2280 

AGATATGGTG TGATGCATAG AAATACCTTC ATTAATTCAC CGGACGTATT AAACGAGAAA 2340 

TATGTAGAAA GCGCAgcTAG CGGCTTAGTT GCAGGTATCA ATCTTGCGCA TAAAATATTA 2460 

GGCAAGGGTG AGGTAGTATT TCCGAGAGAA ACAATGATTG GAAGTATGGC TTACTATATT 2520 

TCTCATGCTA AAAACAATAA GAATTTCCAA CCTATGAATG CTAACTTCGG GTTATTACCA 2580 

TCTTTAGAAA CTAGAATTAA AGATAAAAAA GAACGCTATG AAGCACAAGC TAATAGAGCT 2640 

TTGGATTACT TAGAAAATTT CAAAAAAACT TTATAAAATA GTTAGAAAGA CTAGATATGC 2700 

TATTCATTCT TAAGTCATCA ACGAGTAAGT AATGACTTTC TAAATGGAAA ATACTTATCC 2760 

TAGTCTTTTT AATTTTGGAA TTGTTACGTA TTTCTGACAA TTTAGAATTC GCATTCAAAA 2820 

AATATCTAAA TAAATAACAC GCAATAAGTT GATTGATGTA ACATGTAAGA GAATGTTTTA 2880 

AATAAACTTT ATTTAAAAGG CAATGAAATA ATAAATGGCA AGGCTATTAA TAAAGACTTT 2940 

TAGTAATTAA TTTAAAAAAG AGGTATTCTA ATTAACAGGT TTTCCGATTA GTTACAATTA 3000 

TTTAATTCTC AAAAGATTTA GAATTGATTA TCAAATTACT GTAAGCCCTT TGCTGTATAT 3060 

GCTACAATTC TTATTGATGG AGGGTAAATG TATTGAATCA TATTCAAGAT GCGTTTTTAA 3120 

ATACATTGAA AGTTGAACGG AATTTTTCGG AACACACATT GAAATCATAT CAAGATGACT 3180 

TAATTCAGTT TAATCAATTT TTAGAACAAG AACATTTAGA GTTGAATACT TTTGAATACA 3240 

GAGATGCTAG AAATTATTTG AGCTATTTAT ATTCAAATCA TTTGAAAAGA ACATCTGTTT 3300 

CTCGTAAAAT CTCAACGTTA AGAACTTTCT ATGAATATTG GATGACGCTT GATGAGAACA 3360 

TTATTAATCC ATTTGTTCAA TTAGTACATC CGAAAAAAGA AAAATATCTT CCGCAATTCT 3420 

TTTACGAAGA AGAAATGGAA GCGTTATTCA AAACTGTAGA AGAGGACACT TCAAAAAATT 3480 

TACGGGATCG AGTTATTCTT GAATTGTTGT ATGCTACAGG CATCCGTGTT TCGGAATTAG 3540 

TAAATATTAA AAAACAAGAT ATAGATTTTT ACGCGAATGG TGTTACCGTA TTAGGAAAAG 3600 

GGAQCAAAGA GCGCTTTGTA CCGTTTGGTG CTTATTGTAG ACAAAGCATC GAAAATTATT 3660 

TAGAACATTT CAAACCAATT CAGTCATGCA ATCATGATTT TCTTATTGTA AATATGAAGG 3720 

GTGAAGCAAT CACTGAACGC GGTGTACGAT ATGTTTTAAA TGATATTGTT AAACGAACAG 3780 

CAGGCGTAAG TGaGATTCAT CCCCACAAGC TCAGACATAC ATTTGCAACG CATTTATTGA 3840 

ATCAAGGTGC AGACCTAAGA ACAGTACAAT CGTTATTAGG TCATGTTAAT TTGTCAACAA 3900 

CTGGTAAATA TACACACGTA TCTAACCAAC AATTAAGAAA AGTGTATCTA AATGCACATC 3960 

CTCGAGCGAA AAAGGAGAAT GAAACATGAG TAATACAACA TTACATGCAA CAACAATTTA 4020 

TGCTGTAAGA CATAATGGGA AAGCAGCTAT GGCTGGAGAT GGGCAAGTAA CGCTTGGTCA 4080 

ACAAGTCATC ATGAAACAAA CGGCAAGAAA AGTGCGACGT TTATATGAAG GTAAAGTGTT 4140 

A l l A LAACAG TTTAGTGGTA ACTTAGAAAG AGCTGCTGTT GAATTGGCAC AAGAATGGCG 4260 

/vjV»*»VjA 1 AAA \-AAlTACGTC AATTAGAAGC TATGCTAATT GTAATGGATA AAGATGCTAT 4320 

1 1 lAvvi IVjTv. AGTGGAACTG GCGAAGTTAT TGCTCCAGAT GATGACCTTA TCGCTATTGG 4380 

A 1 v-AGGAGGC AACTACGCAT TAAGCGCAGG ACGTGCATTG AAACGCCATG CATCGCATTT 4440 

GTCTGCTGAA GAAATGGCAT ATGAGAGCTT GAAAGTAGCG GCTGATATTT GTGTCTTTAC 4500 

CAACGATAAT ATTGTTGTCG AAACACTATA ATAATCAGAG CACGATAAAT AATTACGAGC 4560 

AATTAATTTT AGTTAAAAGA CGGAGGAATG AAATTAATGG ATACAGCTGG AATAAGATTA 4620 

ACTCCAAAAG AAATCGTATC TAAATTAAAT GAATACATCG TTGGACAAAA TGATGCTAAA 4680 

CGTAAAGTGG CAATTGCCCT ACGTAATCGA TACAGAAGAA GTTTATTAGA TGAGGAATCA 4740 

AAGCAAGAAA TTTCACCTAA AAATATTTTG ATGATTGGAC CAACCGGCGT TGGTAAAACT 4800 

GAAATTGGAA GAAGAATGGC CAAAGTTGTC GGCGCGCCAT TTATAAAAGT AGAAGCTACT 4860 

AAATTTACTG AGGTAGGTTA TGTAGGACGA GATGTTGAAA GTATGGTTAG AGATCTTGTT 4920 

GATGTTTCAG TAAGATTAGT CAAGGCGCAG AAAAAATCAT TGGTACAAGA TGAAGCAACA 4980 

GCTAAGGCCA ATGAAAAACT TGTTAAGTTA TTAGTTCCAA GTATGAAAAA GAAAGCGTCT 5040 

CAAACGAATA ATCCTTTAGA GTCACTTTTC GGAGGTGCAA TTCCAAATTT CGGACAAAAT 5100 

aacgaagatg AAGAAGAACC ACCTACTGAG GAAATTAAAA CAAAACGTTC TGAAATTAAG 5160 

AGACAGCTAG AAGAAGGCAA ACTTGAAAAA GAAAAGGTAA GAATTAAAGT CGAACAAGAT 5220 

CCTGGTGCTT TAGGTATGCT AGGTACAAAT CAAAATCAGC AAATGCAAGA GATGATGAAT 5280 

CAATTAATGC CTAAAAAGAA AGTTGAGCGA GAAGTTGCTG TTGAGACGGC AAGGAaAATC 5340 

TTAGCTGATA GTTATGCGGA TGAACTAATT GATCAAGAAA GCGCTAACCA AGAAGCGCTT 5400 

GAATTAGCAG AACAAATGGG TATCATCTTT ATAGATGAAA TCGACAAAGT TGCGACGAAT 5460 

AATCATAATA GTGGTCAAGA TGTCTCAAGA CAAGGTGTTC AAAGAGATAT TTTACCTATA 5520 

CTTGAAGGTA GCGTTATTCA AACCAAATAT GGTACTGTGA ATACTGAACA TATGCTGTTT 5580 

ATAGGTGCTG GAGCTtTCCA TGTATCTAAG CCGAGTGACT TGATACCAGA ATTGCAAGGT 5640 

CGTTTTCCGA TTAGAGTTGA ACTTGATAGT TTATCGGTAG AAGATTTTGT AAGAATTTTG 5700 

ACAGAACCAA AATTGTCATT AATTAAACAA TATGAAGCAT TGCTTCAAAC AGAAGAAGTT 5760 

ACTGTAAACT TTACCGATGA AGCAATTACT CGCTTAGCTG AGATTGCTTA TCAAGTAAAT 5820 

CAAGATACAG ACAACATTGG TGCACGTCGA CTTCATACAA TTTTAGAAAA GATGCTAGAA 5880 

GATTTATCAT TCGAAGCACC AAGTATGCCG AATGCAGTTG TAGATATTAC CCCACAATAT 5940 

AAATATACAA AAGGAGAAAA ATTCATGAGC TTATTATCTA AAACGAGAGA GTTAAACACG 6060 

TTACTTCAAA AACACAAAGG TATTGCGGTT GATTTTAAAG ATGTAGCACA AACGATTAGT 6120 

AGCGTAACTG TAACAAATGT ATTTATTGTA TCGCGTCGAG GTAAAATTTT AGGATCGAGT 6180 

CTAAATGAAT TATTAAAAAG TCAAAGAATT ATTCAAATGT TGGAAGAAAG ACATATTCCA 6240 

AGTGAATATA CAGAACGATT AATGGAAGTT AAACAAACAG AATCAAATAT TGATATCGAC 6300 

AATGTATTAA CAGTATTCCC ACCTGAAAAC AGAGAATTAT TCATAGATAG TCGTACAACT 6360 

ATCTTCCCAA TTTTAGGTGG AGGGGAAAGA TTAGGTACAT TAGTACTTGG TCnAGTACAT 6420 

GATGATTTTA ATGaAAATGA TTTGGTACTA GGTGAATATG CTGCTACAGT TATTGGTATG 6480 

GAAaTCTTAC GTGAGAAGCA TAGTGAAGTA GAAAnAGAAG CGCGCGATAA AGCTGCTATT 6540 

ACAATGGCAA TTAATTCATT ATCTTATTCT GAAAAAGAAG CGATTGAACA TATCTTTGAA 6600 

GAACTTGGCG GTACGGAAGG CCTATTAATC GCATCAAAAG TTGCAGATAG AGTTGGTATT 6660 

ACTAGATCTG TAATTGTAAA TGCACTACGT AAATTAGAAA GTGCTGGTGT AATTGAATCA 6720 

CGTTCTTTAG GAATGAAAGG TACTTTCATT AAAGTTAAAA AAGAAAAATT CTTAGATGAA 6780 

TTAGAAAAAA GTAAAT 6796 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2073 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

ATCCTAAAAT TnAAAATTAT CACGCCTTTT GaACAGCTTT GTAACCaTCt GGACGATCAT 60 

kAAATTCCaA TGTAAATCCT GGTTTAAaGT TGATCTTTAA CCTTATTTAA AyCACCAATT 120 

GTACGTATAT TATGTTGTTT AGCAAAATCA CGTTTTACAG CTAAAGCATA CGTATTGTTA 180 

TACTTCATTG GTTTTAACAT AGTCATTTGA TATTTCTTTT CAAGACTTTG CTTAGCTTGT 240 

TCATAAACTT TTTTCTCTTC TTTTGACTTC AATGGTTCTT TTGTTAATTC ACCTAAAACT 300 

GTTCCAGTAA ATTCTAAATA CCCATCTATA TCGTCAGATT TTAAAGCATT AAATAAAAAT 360 

GCTGTTTTGC CCATACCATC TTTCACTTCT ACAGTATTTT TGGTCTCTTC TTCTATTAAA 420 

ATTTTATACA TATTTGTAAT AATCGATGGC TCGGAGCCAA GCTTTCCAGC TAACGTAATT 480 

TTATCACCTT TTTGTGCAAA CATAGGAATA GCGATAGCCA GTATAATAAT CATCACTATA 540 

TCAAATATAA TTGCCAATAA GGCTGCTGGA ATTGCACCTA ATAATATCAA CGATGCATTG 660 

TTACGGTCTA TACCTAATAA AATTAAATCT CCTAGTCCGC CTGCACCAAT TAATGCTGCT 720 

AGTGTTGCTG TACCTATAAT TAATACCATA GCCGTTCTTA CACCAGCCAT TATAACAGGC 780 

ATTGCTATCG GAAGTTCGAC TTTAGTTAAA CGTCTAAATG GTTTCATACC TATACCTTTA 840 

GCCGCTTCAA TGAGTGATGG ATCAACTTCT TTAATTCCAG TATACGTATT CCTTAAAATT 900 

GGTAACAACG CATACACTAC AAGTGCAATA ATTGCTGGCA CACGACCGAT ACCAAATAAA 960 

GGAATCATTA AACCTAATAA TGCCAACGAT GGTATGGTTT GAAGAATTGC CGCAATATTC 1020 

ATTACGATTT CAGATATCGT TTTAGTCTTC GTTAATAAAA TACCTAATGG TACCGCAATA 1080 

GCAGTTGCAA TCAATAATGC GATAAATGAT ATTTGAATAT GTTCTATCAT TGTCGAAAAG 1140 

AGTTGCCCCT TACGTTCACT CAATATGTCg AAAAAGTTAG TCATGTTGAG CTACCTCCTT 1200 

TTTCTGGGAC AAATATTTGA AGATATCTTT CCTATCAATA ACATATTGAC CTACGCTATC 1260 

TTCTTGGATG ACAATGACAC GCTCGCTCTC TGATAAAAGT TGATACAATA CTTCAATTGG 1320 

TTGATTGTCA TAAACAATTG GATAAGCGCT CATAGATGTA ACCTCATCGA TTGGTTTCAT 1380 

AATATCCAAG TCACGGATAA TTGCGTTCTC TTCAACACAT GGCGCATCAT CTTCTAAATG 1440 

ACTACCCATA AATTGTTTAA CAAATTCACT TTGAGGATTA TTTTTAAATC CTTCTGGTGT 1500 

GTCAATTTGT TCAATATGCC CTTCATTCAA AAGACAAATC TTATCACCAA GTTTCATCGC 1560 

CTCTTGAATA TCATGTGTAA CAAATATGAT TGTCTTCTTA ATTTTAGTTT GTAATTCAAT 1620 

TAAATCATCT TGAAGTTTTT CTCGGCTGAT TGGGTCTAAT GCACTAAACG GTTCATCCAT 1680 

TAAAATAACT GGTGGATCAG CTGCTAACGC ACGTATAACT CCTACACGTT GTCGTTGCCC 1740 

CCCTGACAAT TCATCAGGTT TTCTGTTTTT ATATTTTTCA GGTTCTAATC CAACCATTTC 1800 

AAGTRATTCA TCTACTCTTT TATCTATATC TTTTTCTTTC CACTTTTTCA TTTGTGGCAC 1860 

TTGTGCAAtA TTTTCTTTGa wTGTCaTATG TGGGAATAAT GCAATCTGCT GcAATACGTA 1920 

TCCAATATCC CAACkCATTT CGTATACTGG ATAATCACTT ATTGGTTTAT v»l 1 InAAAlA 1980 

AATATAACCT TCACTTAAGT GAATGAGTCG ATTAATCATT TTTAATGTCG TAGTTTTTCC 2040 

ACAACCTGAA GGTCCAATTA GCACAAAAAA TTC 2073 


(2) INFORMATION FOR SEQ ID NO: 4: 











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13321 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

ACTATTCTAG CTTCATCAGT TATCATATAT TCTTTGAAAC ACTTGTAAGA AAATATAATG 60 

AGTATTTACT ACATAATGAT ATTTCAAATT AGAAAAAAGG AAGTTATGAT TTAATGGCCT 120 

TGAGCCTATC ATAACTTCCT TTTATCATTT TATTGTTGTG TTGATGTTTC GATAACGTGG 180 

TACATCTTAT CAAACATCAA TTCGAAACCA TGCACCATGG CATCATGATA TTCTTTTTTC 240 

TTTTGCTTGT ATTCTAAATT AGTAAATCGT CTTTCTTTTT CAACTAATGA ACGATAATAA 300 

AATAGCATTT GGGTGCCACC TGTTTCACGT TCAAAAAATT CTACCTCAAT GACATCTTGC 360 

GTTTCACTTA GTCCAGGCAT ACCGATAGTC ATCTTAACGT ATTCATCCAT AACTAAAGAT 420 

TCATAAATGC CTTCAATCAC ATTTACTTTG CCATTACGTT GTTGATCTAC AATACGATAT 480 

TTACCGCCTT CTTTAACGTC CGCTTCAATC TCTTTATTCG TTCTGGCTGA TGTCATAAAC 540 

CATTGTTTCA ACAAATCTTT CTTTGTCCAA GCTTCGTATA CTAACTCTGG AGAAAATTTA 600 

TAAAGCrri'T CAATTTCAAC TTCGACATGT TCATTCTCTA CATTAAATTT TGCCACTGTT 660 

GTCCACCCAC TTTCGCTCTT ACTTTTATTT TAACGTATTT TTGCTCAGTT CCAAACATAG 720 

ATGATCATCA TTTTTAAAAG ATTAGCGTTA TACGGTGAGT ACAACATGAT CTGTTAATAT 780 

AACAAGCCAC CTTACTTGGC TACATCGATA TATTGTTAAG CATTAATGTT TCATTTCTTG 840 

ACTAGTGTTC rrrriTAGcr TTGGAAAATT AAATAAAATC GCAATAAGTC CGCATACACC 900 

TAATAATATA GGATAAATGC TGTATGGGAA TAACATTAAC GGTGAAATAC CAGCTACACC 960 

AGCCGCTGaA ATGACTTGCG GGCTATATGG TAATAAACCT TGGAAGCAGC CTCCAAATAT 1020 

ATCAAGAATA CTTGCTGATT TCCTTGAATC TACATCATAT TCATCTGCAA TATTTTTAGC 1080 

TAAAGGACCT GACATAATAA TAGAGATGGT GTTGTTTGCC GTGGCAATAT CTGCGACACT 1140 

TACC&AACTA GCAATTCCTA ATTCTGCGCC ACGCTTTGAT TTCACTTTAG AGCGAACAAA 1200 

TTGCAACAAC CATTCAATAC CACCATTGTG TTGAATAATA CCGACTAAAC CACCAATTAG 1260 

CAACGCAATC ATAGCAATAT CTTCCATGCT TATAATACCT TTGGACACTG CATCTAGTAG 1320 

CCCCATCCAA CCGAATGAAC CATCTATGAG ACCAATGATT CCGGCTAATA ATGTTCCGCC 1380 

AATCAATACG ATAATGACAT TTACACCTAA TAATGCTAAT ACCAATACTA AGATATACGG 1440 

TACAACTTTA ATTAGATTAT AATCATAGTt TTTAGCATGA TTTAAAGAAA TGCCATTCGT 1500 

TAAGAAATAC AGAATAATAA TCGTTAAAAT AGCACCTGGC AATACAATTT TAAAGTTTAC 1560 

TCTGAATTTA TCTTTCATTT TCGTATGTTG TGTTCTAACC GCAGCAATTG TTGTATCTGA 1620 

AATCATTGAT AGATTATCGC CGAACATTGC ACCTCCAACA ACTGTAGCCa tTGctAGCGC 1680 

TCCTACAGAC GTCCCCATAG ATATAGAAAC AAACATACAA ATCACAAACA ATCCTACAAT 1800 

AATTAAATTT TCTGGGATTA ATGATAGTCC TAAATTAACT GTCGACTTTA CGCCACCCAT 1860 

TTTTTCAGCT GTATTTGAAA ATGCACCTGC TAAAATAAAA ATCAACATCA TTAAAACAAT 1920 

GTTTGAATGG CCTGCACCTT TCGTGAAGAC CTCAACTTTT TTAGCAAATG ATTCTTTTCG 1980 

ATTCATTAAT AACGCCACAA TTACCGTTAT CGTAATTGCA ACATTTAATG GCATTGAAGT 2040 

AAAATCACCT GTGATAATAC CTACGCCTAA AAACAACGCC ACAAATAATA ACAAGGGGAA 2100 

TAATGCCCAA GCATTGCTCT TTTTATGTAC TTCCATCCTT TTTACCTGCT TTCCAATTAA 2160 

AAATACCTCT TTCTCACAAA CGATGAAGAA AGAGGTTTTC ATGTGCTTTA CCTGCTTATC 2220 

TTCAAACCAT TACGGTTACT GGAATTGGCA CATTCGAGAT GTTGCCGAGG CTTCATAGGG 2280 

CCAGTCCCTC CACCTCTCTA GATAAGTGAT GCTTATTTAC GTTTACGTTA CAAGATAATC 2340 

CTTAGTACGT CAATCATAAA TTAATCAGGA GTCGTATAAT ATTTTTCATA AACAATCATT 2400 

GCTACTGTAA TAATAATCAA AACAATAATG CTAATAACAA GTAAAAGCCA CCATTTAAGC 2460 

ATTAATGCAA TAAAAATGAA CACGATAGAC ACACTTACTA ATATTAATGA TATGACTTTA 2520 

AATTGCTGAA CACGTTGCTT GGAGATGACT TTCAACTGTT TGTTTGATAG ACGCGTATTT 2580 

TTTATACTGA TTCCCAGTAT ATTTTCTAAT ATTTGAACCA ATACGATACT TATTGCAAAT 2640 

ATAATAATTG GTAAAACATC ATAGCTCCCT ATAGTTAATG TATAAATTAC AAATCCAATG 2700 

TAAAGTAACC CTGAGACAAA GGATAAAAAG TATGCGACGT ATTTGTTAAA CTTAATGATA 2760 

TGCTTTTTAA CGTTTTGATG TGTAAACCAT ACATTCGAAA CGATCGCAAC TGCTACAAAT 2820 

AATGTGAATA CTATATATAA TGGTAATTTT TGTTCAGGAA AAACAGTCGC TATTCCAAAA 2880 

GCTAATGCTA AAATCAAAAA TAATATAGCT CTAGATACTA TTAATGCCAT AATAACAACC 2940 

CCTTTGTTTA ATATCGAGTT TGCAAATTTA CGTTTATCAG CGTTTCTATG ATCAGTACTT 3000 

CTACGGGTAG CGTTTCTATG TAATTTACAT CATCTTAACA TATAAATACT TCGCTATTTA 3060 

ATTGAAAACA TATCCTATTA TTCTTTGTCC GTTCTGACGT TTAATATCTA GCCTTAGGCA 3120 

TTTCACTTGT TAATGAATTT AACTTTCTTC CACTAACCGT CCCTAAACCC AATCCCGCAA 3180 

CAGTTTTTAA CTTTTTCGTT GTTGTCCTGA CATCCTCATT AAGAAAGTTT ATTCTGCTTA 3240 

AAACTTATAA TCCACACCCT GAGCAAACGC TCCTTATGAC AGAGTATTAA AATAAGCCGA 3300 

TAAAGATACA CACCTTTACC GACTATTTAA AATACACTTC ACCAATTCAT TTTAATTTAA 3360 

TGGATTGAAG TAACTAAATT AATATTATGT TGTTCAATTA AAAGCTTCAT ACAAACCTAA 3420 

TCTATTTGCA CTCCACCGCT AACACCGAAC ACTTGTCCGG TTGTATAACT TGATTCTTCT 3480 

GTTTTTTGAC CAAATGTTGG GATTTTACTT TGAGGTTGTC CACCAGAAAT TTGTAATGGT 3600 

GACCAGAATG GACCAGGCGC TACACAGTTC ACTCTAATTC CTTTTGGTCC TAATTCTTCT 3660 

GAAAAACTTT TAGTTAATGA AATAATTGCT GCTTTTGAAG CGGCATAATC ATGAAGAATA 3720 

GGACTAGGAT TATAACCTTG TACAGATGAT GTCGTTGTAA TTGACGCACC CGGTTTTAAA 3780 

TATTCCAATG CTTTTTGAAC TGTCCAAAAT AGCGGATAGA CATTCGTTTC AAATGTTTCT 3840 

GTAAATGCCT CAGTTGTAAA TCCATGAATA TCATCATGAT ACTGTTGATG TCCAGCAACT 3900 

AAAGTAACAT TATCTAAGCC ACCTAATTGT TGATATGCTT GTTCAACAAG GTCATAGTTG 3960 

AACTGTTCAT CTCTTATATC ACCAGGAATT AACACTGCCT TTTGACCACT TTCTTCAATC 4020 

ACTTGGCGTA CTTCTTGTGC ATCTTGTTCT TCACTCGGAA GATAGTTAAT CGCTACATCT 4080 

GCACCTTCTT TAGCATACGC AATTGCTGCT GCACGCCCTA TTGCTGAGTC ACCACCTGTG 4140 

ACTAATATTT TATAGCCTTG TAAGCGTTGA TGACCTTGGT AAGACGTTTC GCCACAATCG 4200 

GGTGCTGGCG TCATTTCAGA TTGTAAACCC GGTACCTCTT GTTCTTGTTT TTCATAATCC 4260 

GTTGTTTTAA ATTTTGTTCT AGGATCTTGA GCTGCCATTT TTTTACATCT CCTTATTCGC 4320 

TTAATGGTTA TTATTTACCC AATCTTCCTA GGAACTTAAT CATGATTACA CTAAAAATTA 4380 

CTTTCTTCTT TATAAAAACA AGCTCGAATT ATTCATGCAA TAGTCTCTTT ACAAATTCAA 4440 

CAAAATACTC AGGTACTTTT TCCAGAATCC TTTCATCCGG TTTATATTGA GGATGATGTA 4500 

AATCATATTC ACTATGAGAA CCAATTAACG CAAATACACT TGGAAAATGT TGACTATAAC 4560 

CTGAAAAATC TTCTCCAATC GTAAGCGGCT GTTCCATCAT TCCCACCTTA TATCCAACAT 4620 

GTTGGGCTAC TGCAATTGCT TTATGCGTCA ATGCCTCATC ATTCATCACA GCGCCAGGTA 4680 

AATGCGTATA ATTTAAATTA ATTTTCATAT TATATGCTTG AGCCAATCCG TCCGCAATAT 4740 

CTTGJAATCG TGTTTCTACA AGCTTTCGTA CCACAGGATC AAAACTACGC ACTGTGCCTT 4800 

GTACATACGC ATGATCAGCA ATGACATTCC AAGTATTACC ACATGATATT TGTCCAATTG 4860 

TTACTACCGC TTCATCAAAC GCAGATAGAT TTCTACTAAC TATGGATTGA ATACTATTAA 4920 

TCAATTGCGC CAACACAATA ACTGGATCGT TGCATTGTTC TGGcTTTGCA GCATGACCAC 4980 

CCACGCCTTT AATATGAAAC TCAAAACGAT CTACTGCTGA TGTAATTGCC CCTGTTTTGA 5040 

TTGCAAATGT ACCTACCGAA CGCGATGGGT CATTATGAAA ACCCAATACT GCTTGTACAT 5100 

CTTTTAATGC ATGTGTTTCA ATAATTTTAA AAGCGCCATG TCCTAGTTCT TCTGCTGATT 5160 

GAAAAATGAA TTTAACACGC CCAGTAAGAG TGCCCTCAAT TTCTTTTAAT TTTACAGCTG 5220 

TAGCCAAAAT ACTAGCCATG TGAATATCAT GACCACACGC ATGCATAACA CCTTCATTTT 5280 

EP 0 786 519 A2 

CAGCTATACA ACTCAGACCT TGTCCCACTT CAGCAACAAG CCCAGTCGCA AGTGGTAAGT 5400 

CTAATATTCT AATATGATGT TCTGTTAAAA TATCTTTAAT TTTTTGTGTA GTCTTAAATT 5460 

CTTTATCGGA TAGTTCTGGA AATTGATGAA AATACCTTCT CCAGGTAACA GCTTGATCTT 5520 

TTAATCCCAT CGGTCATTCC CCTTCCTTAA GTCAATGATA TGTTGTCTAC CCTACGATGA 5580 

TCATCTTTGA CTATTAAACG ATGATTTCAC AACAATGTAC TCTTGTTAAT TGCTTTCGTT 5640 

AATGATAGAC AGTTGTTTAA TAATATCGTA ACACTGTTGT CAAACTATTC TAACTTTTAT 5700 

AATTGAGACT CTATACAAAA ACGTGTTCTC GAATATACTT GTTTTTACAA ACCACAAAAA 5760 

GCTCTAAACA TTAGTTTAAA CCAATGCTTA GAGCTTTCTA ATTATTTTAT GCTTTAAAAG 5820 

ATACTGTGTT ATCTACGATG ACCTTACCGT CTTTAATAAC TTTTTCTGCG TGATTGATAC 5880 

CAAAATGATA TGGAATATAT TCATGATTTG GTGCATCCCA AATTACTAAA TTAGCCTTAT 5940 

CACCTGTGTT AATTGTACCC GCGTTAATGT CTATTGCTTT AGCAGCATTG ACCGTAACAG 6000 

CATTCCAAAC TTCATTAGGT GATAGCTTTA ATTTCAAGGC TGCAATCGCC ATAACAAGTT 6060 

GTAAGTTGTT TGTGACACTA CTACCAGGGT TATAATCAGT TGCTAATGCA ATCGCACCGT 6120 

TATTGTCAAG CATGCCTCTT GCATCTGCAT AATCTTCTTT ACCTAAATAG AACGTCGTTG 6180 

CAGGTAAGAG GACAGCTACA GTATCACTAT TTCGCAACTT TTCTTTTCCT TTATCACTAG 6240 

AAGCTACTAA GTGGTCTGCT GATATTGCTT GTTCATCAAT TGCTAATTCC AGTCCGCCTA 6300 

ACGGATCAAT TTCATCCGCA TGTATTTTCA CTTTAAAACC TGCTTCTTTG GCTTTTTGCA 6360 

TATAATGTTG CGATTGTTCT ATTGTAAATA CACCTGTTTC ACAGAAAATA TCCGCAAAGT 6420 

CTGCATATTG TTTTACTTCC GGAAGTAACG CAATCATTTC TTCTAAAAAT GCCTCATTTG 6480 

AACTTGCCTC TTTAGGTACA GCATGAGGCC CTAGGAAAGT ATGTTTCATG TCTAAATCAT 6540 

ATTTCTCAGC TAAACGATTA GACACTTTCA ATTGCTTCAG TTCATTTTCT CTATCTAATC 6600 

CATAACCACT CTTACTTTCA ACTGCAAGCA CGCCGTGTTT AATCATAGTA AGCAAATCAT 6660 

GCTCTGCTTT TTTAAACAAG TCATCTTCGG ATGTTTCTCT AGTAGCATTA ACGGTAGATA 6720 

ATATGCCACC ACCCATTTCT AATATTTCAA GGTAAGACTT ACCTTGACGT TTTAATGACA 6780 

TCTCATGTTC TCGAGATCCA CCAAATGTTA AATGGGTATG TGCATCTACT AATGCTGGGG 6840 

ACACTACCTT CCCACTAGCA TCAATCGTCT CAGTCGCATC GTAGTCATCT GTATGTGTTC 6900 

CAGCATATAC AATTTTGCCA TCTTTAATGA CAACTGTACC ATTTTTCACA ACATTTAATT 6960 

CATCTAATTC CTTACCCTTC AAAGGTTTAT CTGTTGATCT CGGTAAAATT AATTCTGCTA 7020 

TATGATTAAT TATTAAATCA TTCATTACTT ATCACCTGCT TTATCAATCA TTGGAATATG 7080 
AACACCCATA CCTGGGTCAG TCGTCAATAC ACGTTCCAAT CTTCTTTCAG CACGCTCTGA 7200 

TCCATCTGCT ACAACAACCA TACCCGCATG AAGTGAATAT CCCATGCCAA CACCGCCACC 7260 

GTGATGGAAT GAAATCCATG AACCACCTGC AGCTGTGTTA ATGAGTGCAT TCAATACAGC 7320 

CCAATCACCA ACCGCGTCAC TACCATCTTT CATACTTTCT GTTTCACGGT TAGGACTAGC 73 80 

AACTGAACCA GCATCTAAAT GGTCTCGTCC AATAACAATT GGTGCTGAAA TTTCACCGTC 7440 

ACGTACAAGA CGATTTAAAG CTAAGCCCAT TTTCGCTCTT TCTCCATAGC CTAACCAAGC 7500 

AATACGTGAT GGTAGTCCTT GATATGAAAT TTTTTCTTCA GCTAAATCAA GCCATCTTAA 7560 

TAACTTTTCA TTTTCTGGGA AAAGTTTGCG CATTTCTTCA TCCGCACGCT CGATATCTTT 7620 

TGGATCACCA CTCAACGCAG CAAAGCGGAA TGGCCCTTTA CCTTCACAGA ATAATGGTCT 7680 

AATGTAAGCT GGTACAAAGC CTGGGAAGTC AAAAGCATTT TTCACTCCGT TATTGAAGGC 7740 

TACTTGACGA ATATTGTTAC CATAATCAAA TGCTACAGCG CCACGTTTTT GGAATTCAAG 7800 

CATTAATTCA ACATGCTTTG CCATTGAAGC TTGTGACAGT TCAACATATT TTTTCGGATC 7860 

TTTTTCACGC AATACTTTCG CTTCTTCTAC AGAGTATGCT TGTGGCAGAT ATCCATTTAG 7920 

CGGATCATGT GCACTTGTTT GGTCAGTAAT AATGTCAATT TTAAATCCTT TTTCTAGAAT 7980 

CGCTTGATGG ATGTCTACAG CATTTCCAAC TAACCCGATT GATAATCCTT CTCCACGTTG 8040 

TTTCGCCTCT TCTGCTAATT TTAATGCTTC ATCTAAATCA GCTGTTTTAA CATCACAGTA 8100 

TTTCGTATCA ATTCGCTTAT CAACACGTGT TTCATCAACA TCCACGCAAA TTGCTACCCC 8160 

ATGATTCATA GTAATTGGTA ACGGTTGCGC ACCACCCATA CCACCTAAAC CTGCTGTCAG 8220 

TGTAACAGTG CCTGCTAAAT CTCCATTAAA GTGTTGATTA CCTAGCTCGG CAAATGTCTC 8280 

ATAAGTACCT TGCACAATAC CTTGAGAACC AATATATATC CAACTACCGG CTGTCATCTG 8340 

TCCATACATG ATTAAACCTT TTTTATCTAA TTCATTAAAA TGATCCCAGT TTGCCCATTC 8400 

AGGCACTAAT ACTGAATTTG AAATTAATAC ACGTGGCGCT TCTTCATGTG TTTTAAATAC 8460 

AGCAACTGGC TTTCCTGATT GTACTAACAT TGTCTCATCT GATTCTAATT CTCGTAACGT 8520 

TTTCTCTATT GCTTCAAAAG CTTCCCAATT ACGTGCTGCT TTTCCAATAC CACCATAAAC 8580 

AACTAAATCT TCTGGTCTTT CAGCAACTTC TGGGTCTAAA TTGTTGTATA ACATTCTAAG 8640 

TACTGCTTCT TGTTCCCAAC CTTTACACTC AATACTCAAA CCTTTTTTTG CTTGAATTTT 8700 

TCTCATAAAA TTCGCTCCTG TTCTTTTAAG AAGTTAATTC CACTAAATTT AAAACGCTTA 8760 

CATTATTATC TTCAATATTC ATTATAGTAT GTTAAAATAT AGCCAACAAA TATAAATAAA 8820 

CTAATTATCC ATAGCTTGAA TCTATAAATA AAAGGAGCAA AACACATGAA AATTATTCAG 8880 

CATATTAGCC AGCCATCTTT AACTGCTACG ATTAAAAAAA TGGAAGCAGA TTTAGGTTAT 9000 

GACTTATTTA CACGTTCAAC AAAAGACATC AAGATTACCG AAAAAGGAAT ACAGTTTTAT 9060 

CGTTATGCGA GCGAATTAGT TCAACAATAT CGATCCACGA TGGAAAAAAT GTATGATTTA 9120 

AGCGTTACAT CAGAACCAAG GATAAAAATT GGGACTCTTG AATCTACGAA TCAATGGATT 9180 

GCGAATTTAA TTCGAAAGCA CCATTCCGAC TACCCTGAAC AGCAATATCG TTTATATGAA 9240 

ATACATGATA AACATCAATC TATAGAGCAA TTACTGAATT TTAATATTCA TTTAGCTATA 9300 

ACAAATGAAA AAATAACCCA CGAAGATATA AGATCCATTC CTTTATATGA GGAATCTTAC 9360 

ATTTTATTAG CACCCAAGGA AACATTTAAA AATCAAAATT GGGTAGATGT TGAAAATTTG 9420 

CCACTCATAT TACCAAACAA AAATTCTCAA GTGCGCAAAC ACTTAGATGA CTATTTTAAT 9480 

AGAAGAAATA TTCGTCCAAA TGTCGTTGTA GAAACAGATC GATTCGAATC AGCAGTTGGA 9540 

TTTGTTCATC TCGGCTTAGG TTACGCTATC ATTCCGAGAT TTTATTACCA ATCATTTCAC 9600 

ACGTCTAATT TAGAATATAA AAAAATTCGT CCAAACTTAG GCCGAAAAAT TTATATCAAT 9660 

TACCATAAAA AACGCAAACA CTCCGAACAA GTACATACAT TCGTACAACA ATGCCAAGAT 9720 

TATTTATATG GACTTTTAGA GGCTCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9780 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9840 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9900 

CTCAGTCAAC TGTATACCTT TTTCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 9960 

CTCAGTCAAC TGTATACCTT TTGCCTTTAA CTTAAGTTAT TAGTGCCTCT TATGTAGTTG 10020 

CGTAGTCAaC TGTaTACCTT TTGCCTTTAA CTTAAGTTAT TAGAGCCTCT TATGCAGTTG 10080 

CGCAGATCAT CGTATAAAAA TTAATGACGT CATTTCAAAA ATCGATACAA AAATAATTTA 10140 

TTATAAAAAT TCTAAGAAAG AAGTGAAGCA GATGTTAAAA TCTATTAATC ATATATGCTT 10200 

TTCAGTCAGA AATTTAAACG ATTCAATACA TTTTTATAGA GATATTTTAC TTGGGAAATT 10260 

GCTATTGACT GGTAAAAAAA CTGCTTATTT TGAGCTTGCA GGCCTATGGA TTGCTTTAAA 10320 

TGAAGAAAAA GATATACCAC GTAATGAAAT TCACTTTTCA TATACACATA TAGCTTTCAC 10380 

TATAGATGAC AGCGAATTTA AATATTGGCA TCAGAGGTTA AAAGATAATA ACGTGAATAT 10440 

TTTAGAAGGA AGAGTTAGAG ATATTAGAGA TAGACAATCA ATTTACTTTA CCGACCCTGA 10500 

TGGTCATAAG CTAGAATTAC ATACTGGCAC ACTTGAGAAC AGATTAAATT ATTATAAAGA 10560 

GGCTAAACCA CATATGACAT TTTACAAATA AGGTGTCATT ATAAAAAGGC CTCTTGAACT 10620 

CCGTTAAAAT TTTAATTAAT TATTATATAA TAAGAGAACT TTTCAAACAA TACAGTTGTT 10680 
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TTACTG CAAT TATTTTTCAA ATATATCAAC GTTAATATAA CTTCTATTAA GAAATACTCA 10800 

CATTCTGCCC TGCAATGCAA ATCTCGTCAC ATATAAATAT TTTTAATTAT TTTAAAAAAT 10860 

GATGCACTAA ATTAGCAACG AGCTTAGCAG TTGTATTGTC AGCGTCATAT GTTGGATTCA 10920 

TCTCAGCAAT ACTAACTGAA GACACCTTAT CACTTGGAAT AATACGTTTT GCTAATTCAA 10980 

GAACAGTATG TGGATACAAA CCTAACACTG CCGGCGCACT TACCCCAGGC GCAAACGCAC 11040 

TATCAATGAC ATCCATACAA ATCGTAAACA TAATGACATC ATGTTCATGT ACAAAACGTT 11100

CAATCATATC TTTAATTGTT GGTGATACGT GACTCAATAA TTCATCTGCA AAGACATAAT 11160

CAATCTTTTT CTCTTTAGCA TAATCAAATA AACTTTGCGT ATTACCACCT TGAGCAATAC 11220

CAAGCACTAA ATAATCTGTG TTTTCATCTT CTTCTAAAAT TTGTCTAAAG CTCGTTCCAG 11280 

ATGTAGATTG TTGTTCAGCA CGTGTATCAA AATGCGCATC AATATTTATC ACACCAATAG 11340

ATTGTGTTGG ATAGACTTTA CGTGTTGCTA AATATTGAGC ATACGCAATA TCATGTCCAC 11400

CACCTAATAA AAATGTTTGT CTATGATTAG CAATTGACTT CGCTGCAAGC ATAGCAAATT 11460 

CTTTTTGAGT ATCAATTAAT TCCTCATGAT CATGATAAAC ATTTCCGTAA TCGACTAAAG 11520 

TTcACATTGA TTCAAATCCG GCAAACCTGC AAATGCTTGT TTAATCGCAT CTGGTCCTTC 11580 

TTTTGCACCA ATGCGCCCCT TGTTTAAAGC AACACCTTTG TCAACAGCAT AGCCTAATAT 11640 

ACCGACCCCT GATGGCATAC TACTCTTTTC CAGCTTAGAC AAATCTTCAA ATGTTACTGT 11700

TTGAAAATGT CTAAATTTTT TCGGGTCTGT TTCACTATCT AACCTTCCAG TCCATAAATT 11760 

TGGTTCACCT TGCTTGTACA CAGCATTTCC CCCTCTTATT TATGTGGCTT ATTAACAATT 11820 

AAAGTATAAC GTATAGGAAA TTTTGAATTC AATTCATAGT TAAATCCGTA TCTTAAAAAT 11880

ACTTATCTAC ATTACTTTTA CCCCTATTTT CTATGTAATA ACGAATACTT AGCTGATTTA 11940 

TGTTAATAAA ATACGTCAAG ACTATTACAT TTTCATTAAT ATTGACATAG ACAATTTATC 12000

TCTCGGCTTG TAATATGTAT AATTGTTACT AAAAGATATT TTGCTTGTTA CCTAATGGAG 12060 
GTTACATATA ATGAAGAACA ATAAAATTTC TGGTTTTCAA TGGGCAATGA CGATTTTCGT 12120 
CTTCTTTGTC ATTACAATGG CGTTATCCAT TATGCTCAGA GATTTCCAGT CTATAATTGG 12180 
TGTCAAACAC TTTATATTTG AAGTTACAGA TCTAGCACCA TTAATTGCTG CAATCATTTG 12240 
TATACTCGTT TTCAAATATA AAAAGGTCCA ACTTGCAGGT TTAAAATTCT CAATCAGCCT 12300

GAAAGTAATT GAACGTCTAT TGCTAGCTTT AATTTTACCT TTAATTATTC TAATTATTGG 12360

TATGTACAGC TTTAATACAT TTGCAGATAG CTTTATTTTA TTACAATCAA CAGGCTTATC 12420

AGTACCTATT ACACACATTC TGATTGGACA TATTCTGATG GCGTTCGTAG TAGAATTCGG 12480

TGTTGTTGGT TTGATGTATT CAGTTTTCTC AGCAAATACA ACTTATGGTA CAGAATTTGC 12600

TGCTTATAAC TTCCTTTATA CATTCTCATT CTCTATGATT CTTGGTGAAT TAATTAGAGC 12660

GACTAAAGGA CGTACAATTT ATATTGCAAC GACATTCCAT GCTTCAATGA CATTCGGACT 12720 

TATTTTCTTG TTTAGCGAAG AAATCGGCGA TCTATTTTCA ATCAAAGTCA TCGCCATTTC 12780

AACAGCAATC GTTGCAGTAG GATACATTGG TTTAAGCTTA ATTATCCGAG GTATTGCATA 12840 

TTTAACAACA AGACGAAACC TTGAAGAACT TGAGCCTAAT AATTATTTAG ACCATGTCAA 12900 

TGACGATGAA GAAACTAATC ATACTGAGGC TGAAAAATCT TCTTCAAATA TTAAAGATGC 12960

TGAAAAAACA GGTGTAGCTA CTGCATCAAC GGTTGGTGTT GCTAAAAATG ATACTGAAAA 13020

TACAGTGGCT GACGAACCAA GCATTCATGA AGGTACTGAA AAAACAGAAC CTCAACATCA 13080 

CATAGGTAAT CAAACTGAAT CTAATCATGA TGAAGATCAt GACATCACTT CGGAGTCAGT 13140

AGAATCAGCm GaATCAGTTA AACAAGCACC ACmAAGTGAC gATTTaACAA ACGATTCAAA 13200

TGAAGATGAA ATAGAGCAAT CATTAnAAGA ACCTGCGACT TATAAAGAAG ACAGACGTnC 13260

ATCAGTTGTA ATTGATGCAG AAAAACATAT CGAAAAAGCT GAAGAnCAAT CTTCAGATAA 13320

A 13321 
(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8549 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ATGTGTTGTA AACTTTTATG TTGAAAAAGC TACTTATCTC AATGAAAACA AGTAGCATTT 60

AATAAATTAA TTAGTATACA GCTAGTTTTT CTAATTGTTC TTTAACTTGA ATTAAGTTTG 120

ACCGTATTAG AGAGGCAGAT TGATCCATCG TTTGAATTGC TTGTCCTTCA TTTTCGTTCA 180

AGCCATTACA AACAACTTCA AACTGTTGTG CCATTTGATC AAGACGCGCA TGAGCTTGTG 240

TGTTTAAAAT AAACATATCG TCATAATGTG ATGGCGAATA GATAATTCGT CGTTGTATAC 300

AAACGTATAA AAACCTTGTC ATATCAACGG TTTTGGCATT TTTAAACCTC TGTGTTTTCC 360

ACGCATGTTT GCCCTTATTT AAATAATTTG CCCTTTTTTC GCCCCGAAAA AAAAACACAA 420

AAAAATAACC ACACTCCTAA ATTAATAGGT GGTGTGGTTT TGTTGATTGT AGGGGTATAA 480
AAATAACCGC ATTATTAAAG ATACGGTTAC TCTGTTATCT GTAAATATAA TAGTAGTTTA 540

AAACAGGACT CCACATAAAA ATCAACTCCT TTATATACCA TAATGATACT ATATTTTCTA 660

GTTTATTTCA ATTTTTCAGT TTTTAAAAAT GAGTTTCTGT TTTTATTTAT ACGCTTTTCT 720

GTTTTCTTTT TAAATTTTAT CTTTTTGTTA TTCCATTCAT TGTAAAATTC TATTAAATTA 780

ACATAAAATT TTTCATGCCC TATTTTATTT GTTGATGAGA TATCAATGTA AAGACTCAAT 840

ATTGTTTTTA AATAGATTTG ATGCAACGAC TGATAAACCG TATTACTATC TGCTATGTTA 900

TTGGTAAAAT GCATAGAAAA ATATTCTAAT TTATTCATGC AATATATATG GGTTTCATTA 960

TACTTCTTAA TGAGTGTATT TATACCTTGC AATACGTCAT TACTTTTAAT AACAATTTCT 1020

TTTTCACCTG TCGAAAAAGT CCACTGTTTA TCTCCTATAT TTTCTTTAAT TGTTTTCTTG 1080

TTGTCAAATT CTAAAATTAT AGCCCGTAAA CACTCTTCTT TATAATTCTC GTTCTTGAAA 1140

GTACGAAGCA AAATTTTTAT AAATTCGGTA TTGGTGACTT TTTTATAAGT GTGATATTTT 1200

GCAATCTCTT TATCAGTAAA GAGTGTTCTT AGTTCGTGAT TATCAAAACT TAAATTCATC 1260

TTATTCTCTA ATTCATTAAT TTTATCTTGC AAACCAACAT TTTCTAAAAT TTTCTTGTTT 1320

ATCTCCCCTA TATCAAAACT CCITTTCGAA ATTAATTTTG AAAACTCGTC TGCCATTTCA 1380

ACAGCCTTTT CTTTCCTTTT ATACCTTTTG TTAAATTTAT GAACCACCGT TGCAGCATAA 1440

TACGATATCC CACCAGATAA AATAGATGaT ATTATCGGTA TGTATATATC ACCTTTCATA 1500

TTTCCACCTC TTTTAACACA ATTAAGTATT ATGATACACA ACTTGCGCAA AAAGATGTAG 1560

ACAGAACATA ATGGCGAACA AAAACAACCA CCCAGTAACT AGTATGGGTG GCGTAgACTA 1620

TAACAACTCT ATGTTATCAA GATATATGTA TCGAGTGATG GCAAGGAAGA AGTCTCCTGC 1680

GGGACCAACA GTCAGATATA TGGCCTCTGC CGGGCTATAT AGTTCACTCC TACTATATAA 1740

AAGTAAGTAT AACATAAAAA GCACCCCGTA AACTGTTATA CGGGAATGCT AAAGTCATAT 1800

ATACTACGGG GAGTAGTATG AAAACTATGC TCTCTATCGT AAGAAAAAAC ACCCAGTGAC 1860

ATGCTTGGGT GAACAAGGAT AGATGTAAAT AGTTGATGCA TGTGTAcACA TCATAACAAA 1920

AAACTAGCCC GAAGcTAGCT ATAACATAAA AAAATAGGCA AGTACCGAAG TACCTGCCAG 1980

TTACGCACAT TTAAATCTTG AGAGTAATGT TAAAAAGTGT ATAGGAATAT TAACATCCAT 2040

CCAAATAGTT ATTTAATAAC TGTAAGATTC CCTATAATTA ATGTAGCaAA ATTTTTATTC 2100

TAAGTAAATA CTAAATCGTG CTAAACTTAC CAAAACTACT TATTCTATTA CCTGCCTTGT 2160

CTACCTCTCC TGTCGCTATA TAACGACGTT GTCCACTATT AGCAATATAA GTAATCCATC 2220

TATAGCCATT GATGCAATAT GCGCCGTCAT ATTTAATTGT TGCGTTATTA GGTAATACAC 2280

CTGTAATTCT TGAATTAGTT GAATAGCCGT CCCTTACGTT ATTACCTTTA ACATTGGCAA 2340

CTGGCACTGG TGGATTTTTT TGGTTTTTAG CTGATGTTTT AACATTACCA GCTACCAAAC 2460

CACCTATAGG CTTACCATGA ATCGCACCGG CTATTAATTT AGAATACAAG TCATAGTTTT 2520 

TCTTAATCCA ATCCATATCA TTTTTATTAG TAATAAAACC TAATTCAGAT AAACGATAGT 2580 

TTATATTTAT TTCTGCTGAT ACATTAACGT TTAGTAAATC ATTACGAGGT GTTACACCTC 2640

TTATTTGTCC TAAGTTATTT TTAATAACAT CTTGTATACT TTTATCAATA GTATCTGCAT 2700

TGAATTGACT TGAAATAATA ACATGCCCAC CACTTGCACT TTCTCCTGCT GCGTCTAAAT 2760 

GAATCTCTAG AACAATGTCA TACCCATGTG ATTTAACCCA ATATAAGCCA TAATCTTTAT 2820

TATTTCCTAC ATTAACACCG TAAGCAGTAT CTTGATACAT ATCTTGTGAT TGACTTGAGC 2880

CACCATATAA TGCAACTTCG TGACCTGCAT GTCTTAAATA CTTAGCGATA TTTGGTGTTA 2940

TATATTTACG GATAAAATCA CGTTCATTTG TTCCGTTTCC GACTGCTCCA GGATCGTTAT 3000

AACCATGACC GGCTACAAGC ATAATTTTTT TAGGTTTAAT TACTGCTTGC TTTTTGGCAG 3060 

TTGCTTGCTT AATAACGCTT TTAGCTTTAT CTCCAACACT TACTTTATCT GGGAAATTTA 3120 

ATCTAATAAA ATACATTGGG TCATCGTAAT AATGAACATG TCTTGTAACG GTTTCGGGAC 3180 

CCCAACCAGG TTGCGCAACG CCATTTGTCC AACCTTTACC ATTCCAATTT TGGCCAAACG 3240

ATGTGAAAGT GTTTAGATTA GCGCTCTCAA CAATTTCAAC ATGTCCaGct CCGCCACCAT 3300

ACTTTGACGG GAAAACGACA ATGTCCAACT TTTGCGGTAA AAAGCTATCA TAGTTTTTAA 3360

TTATTTGCCC GTATTTTTCA ATCCTTGCTT TATTATCAAA TGGAATATTA TAAGCGTATA 3420

AACCTTGTAA CcTTTCGCCT GTTGCTATCA TAAAAAACAT ATTTGCGTAA TCGTAACACT 3480

GAAATCCATA AAACAAATCA GGATTGAACT GCTTCCCTAA TGAATTATCA AACCATTTTT 3540 

CTGCTTGGTT TTTTGTTATC AACATTGGTC AACACCTACC CTAAATCATT TGTGTCGTTC 3600 

ATATTCGTAG GTGTCATTAC TTCTTTAATT GGCGCTTGCC CTGTTGCTTT TCTATACTTG 3660 

TTTTCAGCTT TATATTTCTT TAGCTTTTGA TTTGCCCATT TACCTTCTTG AGATGTTGGA 3720 

TTATCTTTAT ATGTAGTATA TAAAGCAACA ACTGTTAAGA TAATCGATGA AACACTTTCT 3780

TCATCTACTG GTATCGGACT TATACCTTTA TTCGCTAAAA ACTGATTGAC TAATGCTAAG 3840

ATCAATACGA TGTATCTTGT TATTACTTTT GCATCCATTT GTTTGCTCCT TTTATCCAAA 3900

ATAAAAAGCC AGTGCCGAAG CACTGACTCT TAACTATTAC TTACACTTAC TAAACCAGAA 3960

ACACGACCAA AAGCTATATC CTAAAATTCC CTTAAGCATG GTAATCACCT CCTTTAAATG 4020

CCAAAAATAG TTTTTAACAA GGCTATAACA AATGTACTTA GAATCGTCCC TATTAATCCT 4080

AGAATCCACA TCTTGATGTC TCTAATATTT TTAGCATTTT TCTCTTTATT TTTTTCATCT 4140

TGCGTTCTCA GACTGTCTTC TATTCTGTCG AATTTTTCAA ACATAGTCTT ATCATTTTCT 4260

TCTAATCGCG TtAAACGCCA ATCTTGTTCG TGTCGTTTGG TAAATCCAAA CATTACACCA 4320

CCCACTTTAT TCAAATTAAA AAGCCATAAG ATTATAACCT ATGACTCTAG ATTTTCTGGA 4380

TACTTTTCTC CTGTAATAAT TGCATATTCC TCTTTATCTA TAACTTCCAT ATCTACATAC 4440

CACGCTATAT CTTCTTTACT ATATTCTTTC AATTGATACC ATGTTTTAAT ATCTTCGAAT 4500

GTTGGTGAAA TTAATTTAAG CATTTTCAGT CTCTCCTTTA ACCTCTTCTA A'llTTTTATT 4560

AAGTGTCACA AGTTGTTTTG CCATTAGTGC ATTTTGCTTA TTAACTTGCA TCGATAACTT 4620

TGTACTTTGA ACAACTTGTT TCTGCATACT AGCAACCATT TTTCGTAAGA TGTCATCAGA 4680

AGCGACTGTG TTTTGTTCTT CACTGTCAAT CTGTTGATGC AAGTCATCTT TTTCTTCTGA 4740

ATAATCTTCG TTAAAAACTA TTTCCCCATT TGAATATTTA AAGGCTTTAG GTCTAAAAAC 4800

TTGAGAGAAA TTTTCTGGTA AATTTTCAAT ATCAATACCT TCTTCAAAGC CACCAATGAT 4860

AGCGTATGAA ATTATCTCAT TACGCTTGTT AACTAATATT TGCATTATTT TCTCACTCCT 4920

ATAATTTTGT TAATTGTCCC TCTATTTGCG TTCGCACCAG AGCCTCTTTG ACTTCCTAAG 4980

TCGAAATAGA CATCGTTTGA TATAGTTAAA GATGTACGAC TAGATTTAGT TAATCCAAAC 5040

TCATAAACAC CTCCACCATT TCCATCACCA TCTGGAAGAT TTGAGGGATT CAATGAAATC 5100

TTTCCTCCTC CAAAAGGACT GCCAAACTCT GTAAAGTCAC CACCTGGAAA AGTCCCATAA 5160

AAAATTAATA AAATAAATTG GTCTAAACTC TCATTTAAGT ACAATGTAGA GCCCACACCA 5220

TTTGCTGTTC CATCAAAAAT AACCGAATAC CTTTTATTAA ACTTGTCATC TGCGTATAAT 5280

TTAGCGTTAC TTTCGGCCAT ATTAGCTTTT GATTGGGCAC TTTGAACAGT TTCAAAAGGT 5340

GTATTGTAAT CATTAATAGC TAATTCTGAC CACTCAGACC ATGAACCCGC TTCTTTTCTT 5400

TTAACAAATA CTTTATTTGT ACCGTTCGGT CGATAAGTCA TACGCTTGTA ATCTGAAGTT 5460

ACTACTAAAT ATTCGACAGT ACCGTTAGTA CTAACACCTC TTGGATAATT TATAGCTTGC 5520

GAAACATAAA TAAATTGGGT TGAATCACCT ATTCTTTGTT CTGGATTATT AAAATCAAAT 5580

CCAGTAATCT GCATTATCTT ACCATCATCT TTAGTAATCT TAGCriTTTG CCAATTTGAA 5640

GTAGAACCAC TTGTGACTAA ACCACCACTA TTCACTGACT GCTTGAAGGC TTCATGTTTC 5700

TCATCCATAT ATCGCTTTTG CTCATCGAAT GTTCTTGAAT ATGCTTGCGC TTTATTTTCC 5760

AAATCAGATA TATGGCTATT AGCAAGTTGC TTTAATTCAT CTATACTTGA AGATTTTGCT 5820

ATTTGAATAT CTGATAGACC TTTTTCTTTA GCTTTTTCAA TCAGACTCGC ATAATCTTCA 5880

CCATTTTTTA TAGCCTCGTC CATTGCTTTC GCACGATCCA TAATAGTTTT TTCTAATTCC 5940

TCAACGTTAA ATGTGATAGT TCTCTCGACA ACTACCACGT CTGAATTACC TAATTCTGCA 6060

ACCGAAACTT GAGCTTGATA ACTTCCATCT CGTTTAATTA CATCATTAGG TAATTGAAAT 6120 

TTTAAAATAC CTTTAAATGG ATCTAATATT TCTAGTGGAG CAACTACCAT GACTCCTTTA 6180 

CCTCGAATCG CTATTCGTGC kTTGATATTT tCTTCACTCA ATAATAACGG TTGATTATTT 6240

TTAGTGATAT TAAAAAGAAG AACAGAAGAA TCACTCTCTC CTGTTCTAAA AGTTATATCT 6300

AGATTTGAAA TATTTCCATA ATGCGCTGTG TTTTCTAAAT TTATAGCTAC AGATTTCTCT 6360

AAATTACTCA TTAACTTATA ATTCTCCCTT CGTGTAAAGT CCATGGCCCT GAACTTGTTT 6420

TACTATCATA ATTTTTCAAT AGTATCTCAG CAGATGCTGT AACACTATTA CGAACTAGCC 6480

TATGAACAAA GCCACCTGTG TTTGAAGCTT CTACATATAA GTTCCAACCA GCTACCCCTT 6540

TACGTTCAGT TGGAAAATCT GTAAAACGTT TTGTATCATC CGTAGTTAAA TAAAACGACA 6600 

TGCCTACTAT GTTAATATCT GACATTTTTG TGATGAATGA AGGTACTCTC TCCCATTTAC 6660

CACTATTTTT AGGCACATAA TTCCAGTCCG AAATGTCTCC AGTTCTTCCA GAAAGCACCC 6720

TTTCAAAAGT CATCATATTC CTTGCATAAC TATTACGCGT CAATATCTGA ATTACATCAC 6780

CGCCAGTTTG TGGTGGCTTA ACTTCCAAGA ACCAACCTGC ATCACGCCAT TCTCTTGGTA 6840 

ATGGGAAATC ATCGATTTGA ACTGTATGAT CAGTGTATAA ATAGTAAAGA CCTGGCTCTG 6900

TTAACATCCC AAGATTCTTA AGTTTATCAG GCCTCATTGG TAAAGGTTTA ACTCTACCAC 6960

CTGTGTCACT CaTGATAAAA GGAACGCCTC TTGAGTGAAG TATTTCTAAA ATACCTCTTT 7020

GCCCAATCAT GAAAATACGA TGTGTTCTAT TTCCaTCACC ACCGACAGTA ACACCTAGCA 7080

TCAAAGCTTT TTTACCACTA TCTTTGTCAT AGTATATTTG CAAACCTTtC TgCTTCCGCA 7140 

AATTCGCCAG GAAATGAATC tAgTGTTCCA CCATAGTCAG CATTAACCTG ATACGCTTCT 7200

TCTCCTGTTT CTAAATCGAA AGCCGTTAAA TAGTTTCTAT TATTTGGATT ACTGTCTCCT 7260

GTATACCAAT ACAAGTATTT TTCATCAAAA GTCACACCCT GCATTGGTTG GGTTTCGTTT 7320

GTTAGTCTCA TAGGGATACT GATTTTATGC AAAACTTTAT CAATATTTTT ATCAACATCG 7380

TCTAAACTTC TTATCTCTAT ATAAnTCATT GAGTTTTCAA GTTCCCACTG ACTTCTAGGT 7440

CTCTCaATTC TGTATAGAAT TTTATTTTCT TTTTCATTTA TGACAGGGGT GATGTAGGGT 7500 

TTTTCTGGGT GTCCTGTAAA TACATCTTGC ATACCATACT TGCCATAGCT AATTTCCACA 7560

TTAGGCGTAT ACTTGAAACG AACTAATGTA TTCTCATTAT TACCATTTAA GATAAAACTA 7620

TAAATCCATA ACTCATcATC AATATATCTA TAACCGTTAT GTGTACCATG ACCCCCACCT 7680

ACAATCAATG AGCTGTCTAT AAATTGACCA TTAGGTCTTA GACGACTTAG CATATAGCCA 7740

ATTACTGCAT TTGTAAgAGG TGCAAGTTCT GTCACAAATA AAAATTCTTG CTTATCAGGT 7860

TCAAAACGAT ACTCGATATC AAGAATTTCT TGTTTGGTCT TATTTAATTC TCTTATAGTT 7920

TCCTCTTTAT TAATTTGAGT TTTGGTTTCC CAATCGTCTA AATGTTCTTT TAATGTGTCA 7980

AAGGTTTCGG CGTTTACATT AACTCGAGCT TGAACAATCT CATTAGCACT GTTATTACGT 8040

GGTGCCACAA CAAGTGCGTT AATTTGACTT TGTAAAGATT TGTTTACTGC TGCTTGCGAT 8100

CTACCATTAT AATAAATTTG CTCAGCGAAG TGTTGAATTG TTTTAGCTyT CTGATGCAAC 8160

TTAAACTCTG TTGTCAAGCC AAGCGCAAAT TGCTCTATTC TTTGTAAGTT TTGTATTTCC 8220

TTAGCTCTAT AATCTCGACC TGCTAAAGCT CCCAAATCCT TTATTAAATA CAAATTTTCC 8280

ATAATGCACC TTCCTTTCTA ATAAAATAGC ACTGTACCAA GTTTCCCACT ATCGTCAACT 8340

GTTATTTTCC ACAATTTACC GTTTGGGGAT TTCTGTACAA TGCTATTTTG AATAATTgcC 8400

TGctTCGCCT ATTTTTAAAT TATCTAATTT ATTTkTATCA TTTACCGAAA TGATACCGTC 8460

TTGAGGCAAT CCATCAATAn CACTACTGCC TGCATAAGGT ATCCCATTTA TAGCTTTCCA 8520

ATGTGTAGCT GGAAAGTACT GTTTATCGT 8549

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3601 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

AGGCGTGTAG TGACTTACGG nTAGGAAACT ATGTATCCGA ATGATTTATT GAGACCAAAA 60

AGGCATTAAA GTCCATTGAA ATATCnGGTA GCGmGTTGGT ACgTGGACGT GGGGGCCCTA 120

GATGTATGAG TCAACCATTA TTCAGAGAGG ACATTTAACG TAATAAATTA TAGAmACGAG 180

GGTGAAAATA ATGACAGAAA TTCAAAAACC GTATGATTTA AAAGGCAGAT CATTATTAAA 240

AGAAAGTGAT TTTACCAAAG CAGAATTCGA AGGACTTATT GATTTTGCAA TTACATTAAA 300

AGAGTATAAG AAAAACGGTA TTAAGCATCA CTACTTATCT GGAAAAAATA TTGCACTACT 360

ATTCGAAAAG AATTCGACGA GAACGCGTGC TGCGTTTACA GTTGCGTCTA TTGATTTAGG 420

TGCGCATCCA GAATTTTTAG GAAAAAATGA TATTCAATTA GGCAAAAAAG AATCTGTAGA 480

GGATACTGCG AAAGTATTAG GTAGAATGTT CGATGGTATT GAATTCCGTG GTTTTTCACA 540

ACAAGCTGTT GAAGATTTAG CGAAGTTCTC TGGTGTACCG GTGTGGAATG GATTAACAGA 600

TCTAGAAGGA ATAAACTTAA CTTACGTTGG AGATGGACGT AATAATATTG CGCATTCATT 720

AATGGTAGCA GGTGCTATGT TAGGTGTTAA TGTAAGAATT TGTACACCTA AATCATTAAA 780

TCCAAAAGAG GCATATGTTG ATATTGCAAA rGAAAAaGCG AGTCAaTATG GTGGTyCAGT 840

CATGATTACG GATAATATTG CAGArcCAGT TGAAAaTwCm GATGCTATAT ATmCAGATGT 900

TTGGGTATCG ATGGGTGAAG AAAGTGAATT TGAACAcGTA TTAATTTATT AAAAGACTAT 960 

CAAGTGAATC AACAGATGTT TGATTTAACA GGTAAAGATT CAACGATATT CTTACATTGT 1020 

TTACCAGCAT TCCATGATAC AAATACACTT TATGGACAAG AAATTTATGA AAAATATGGA 1080

TTAGCTGAAA TGGAAGTTAC AGACCAAATC TTTAGAAGTG AACATTCAAA AGTGTTTGAT 1140

CAAGCTGAAA ATAGAATGCA TACAATTAAG GCAGTAATGG CAGCAACATT GGGGAGTTAA 1200

TCACTAAATG GAACGATATG AATATGATGT GTCTGATGAT ATAAGTGTCA TGTACAGACA 1260

CCTCATATTG GTATTAAAGG AGAAATGAAT ATGAACGAAT CAGGAGATAA CAAACTCAGT 1320 

AAATCTTCTT TAATTGGACT AGTTATAGGA TCCATGATTG GTGGCGGTGC GTTCAATATA 1380

ATGTCTGATA TGGGCGGTAA AGCCGGTGGA TTAGCCATTA TTATTGGTTG GATTATTACA 1440

GCTATAGGAA TGATTTCATT AGCGTTCGTA TTTCAAAATT TAACCAATGA ACGGCCGGAG 1500 

CTAGACGGTG GTATTTATAG TTATGmTCAA GCAGGATTTG GCGATTTTGT AGGATTTATC 1560 
AGTGmTTGGG GATATTGGTT CTCAGCGTTT TTAGGCAATG TTGCCTATGC AACACTATTG 1620

ATGTCAGCAG TAGGTAACTT TTTCCCGATT TTTAAAGGAG GCAACACATT ACCAAGTGTT 1680

ATTGTCGCCT CGTTACTACT CTGGGGTGTC CATTTCTTGA TTTTAAAAGG CGTTGAAACA 1740

GCAGCATTTA TCAATAGTAT TGTTACTGTT GCAAAGTTAA TACCGATTTT ACTTGTAATC 1800

ATATGCATGA TAATTGCATT CAATTTTGAC ACTTTTAAAA CAGGCTTTTT CAGTATGACG 1860 

TCAGAGGGTG TATTGCCATT TAGTTGGGCG AGCACAATGA GCCaaGTtAA AAGTACGrTG 1920 

CTAGTGACAG TTTGGGTGTT TATCGGTATC GAAGGTGCAG TAATTTTTTC TAGTAGAGCT 1980 

nAAAATGAGA AAGATGTAGG TAGTGCCACG GTTATAGGAC TTATATCAGT TTTAATTATC 2040

TATyTCTTAT TAACTGTATT AGCTCAAGGC GTGATTTTGC AAAATCATAT TTCGCAATTA 2100

GATTCGCCAA GTATGGCACA GGTGCTTGCA ACTATTGTAG GTGGTTGGGG ATCTACACTT 2160

GTAAATATTG GTTTAATTAT TTCGGTACTA GGTGCATGGT TAGGATGGAC ACTGCTTGCT 2220 

GGTGAATTAC CTTTCATTGT TGCAAAAGAT GGATTATTTC CAAAATGGTT TGCTAAAGAA 2280

AATAAAAATG GAGCACCTGT AAATGCACTG CTTATTACCA ATATATTAGT ACAATTATTT 2340 

TTAATAAGTA TGCTATTTAC ACAGAGTGCG TATCAATTTG CATTTTCACT AGCATCAAGT 2400

CGACAGCAAG CAACTACTAA ACAATGGACG ATTGGTATCA TAGCCTCAAT TTATGCTATA 2520

TGGCTTATAT ATGCAGCAGG TATCAATTAC TTATTATTGA CGATGTTACT TTATATTCCA 2580

GCTCTTCTTG TTTATACaAT CGkTCmAAAG rATwATCAGa CACGTTTGAT TAAATCAGrC 2640

TATATTCtTT TTATGATTAT tATCGTACTT GCAGTTATCG GGTTAATTAA GTTATTGATG 2700

GGAACGATAA ATGTTTTTTA AAAGGAGCGA CAAAAATATG AAAGAGAAAA TTGTCATTGC 2760

ATTAGGCGGT AATGCGATAC AGACAACAGA AGCAACAGCT GAAGCACAAC AAACAGCTAT 2820

TAGATGTGCG ATGCAAAACC TTAAACCTTT ATTTGATTCA CCAGCGCGTA TTGTCATTTC 2880

ACATGGTAAT GGTCCACAAA TTGGAAGTTT ATTAATCCAA CAAGCTAAAT CGAACAGTGA 2940

CACAACGCCG GCAATGCCAT TGGATACTTG TGGTGCAATG TCACAGGGTA TGATAGGCTA 3000

TTGGTTGGAA ACTGAAATCA ATCGCATTTT AACTGAAATG AATAGTGATA GAACTGTAGG 3060

CACAATCGTT ACACGTGTGG AAGTAGATAA AGATGATCCA CGATTTGATa ACCCAACTAA 3120

AcCAaTTGGT CCTTTTTATA CGAAAGAAGA AGTTGAAGAA TTACAAAAAG AACAGCCAGA 3180

CTCAGTCTTT aAAGAAGATG CAGGACGTGG TTATAGAAAA GTAGTTGCGT CACCACTACC 3240

TCaATCTATA CTAGAACACC AGTTAATTCG AACTTTAGCA GACGGTAAAA ATATTGTCAT 3300

TGCATGCGGT GGTGGCGGTA TTCCAGTTAT AAAAAAAGAA AATACCTATG AAGGTGTTGA 3360

AGCGGTTATA GATAAAGATT TTGCTAGTGA GAAATTAGCA ACGCTGATTG AAGCAGATAC 3420

CTTAATGATT CTTACGAATG TAGAAAATGT ATTTATTAAC TTTAATGAAC CTAATCAACA 3480

ACAAATCGAT GATATTGATG TAGCAACACT GAAAAAAtAC GCGGCACAAG GTAAGTTTGT 3540

GGAAGGATCG tGTTGCCAAA AATAGAAGCT GCGtACgtTT GTTGAaAGtG GGGaAACCAA 3600

A 3601



(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 573 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

CGACACTATT AAATGAATTA GAGCACAATC TAACAAATCA AATTCATTTT TCAAAAGATG 60

AACGACTCAC ACATATCGCT TTAAAGTTAT TCGAAACAAC CGATCCTGTT TCAACAAAGC 120

AACTTGCGCA AGATGTTAAT GTTTCGCGTC GGACAATTGC AGATGATATT AAAATGATTC 180

TTATTGGTGA GGAAGATCAT TATCGTAAAG CGTATGCACA CTTTATACAT CAATATATGA 300

AACAAGCTGC ACCTTTTATA GAGGCGGATA TCTTTAATTC AGAATCAATC GCATTGGTTC 360

GCCGTGCCAT TATTAAGACA TTAAATAGTG AAAATTATCA TTTAGTTCAG TCGGCTATCG 420 

ATGGCTTAAT CTATCATATA CTCATTGCCA TTCAGCGTTT AAATGAAAAT TTTTCGTTCG 480 

ATATACCTAT CAATGAAATT GATAAATGGC GACATACTAA TCAGTATGCn ATTGCTTCAA 540

AAATGATAGA AAACTTAGAA CGCAGTGTAA TGT 573 
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1221 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

TTGATATTTA TAACGTTATA TTTTAATAGT TCACCTGGAT TATTAAATAA ATAGTCCGCC 60 

AAATTTTCTT TTTCTTTATC AATCTGaTkG TAATTAACaC TTTCGaCTTC TGTAGGAATT 120 

CTAATGTCAA CAGAAGCATT GATATAAGCT TGATGTTGCA TGCAATCACA CTCCTAATCC 180 

TTCATmTmAA ACGGAGAAGT AAACCCGTCA CTATTCAAAT TCAATCCTTT TGCCCAATCA 240 

ACAGGCTTAT TCATGATAGT TTCGATTTCC TTAAGTCCAT TTGAACCTCT AGGTATTTCT 300

ACAATTACTT CATCATGGAC ATGGCCAACT ATTTTAAAAC CTAATGCTTC AAGCCTTGCT 360

ATAGAAATCG CAAGTAAATC CCTTGCAGTT GCTTGAACAA TATTCTCGAC TAACTTCCCA 420

CCATACGTTT TTAACTTTGA CCATTTACGG TTAAGATCTA ACCCCATAAA TTCAACAACT 480

TGACTACCCC AACTATTTTC ACCAACTAAA GCTTTTGGAT AAGCTAAAGC TCTTCCACTA 540 

GGCAGTTCAA TCATTAGAAA ACCTTTTTTC ATATAAAATC TAAGTCCATG TGTATGATGC 600 

GTCTTTCGGG ATTTTACAGT ATTAATTGCA GCCTCTTGGC AAGCCTTCCA AAAATTAACT 660 

ATGTTAGGAT TTGCGTTACG CCAACTATCA ACTAAACCTT GTAACTCGTT TTCTTCAATG 720

CCCATTTCCA ATGCACCCAT TGCTTTTAAA GCTCCAGCGC CACCTTGATA GCCTAAAGCT 780

AATTCGGACA CTTTTCCTTT TTGTCTGAGA GGGTCGCCTT TAGTTATGCT TTCTACCGGT 840 

ACATTAAACA TTTGAGAAGC CGATGCTTCA TATATCTTTC CGTGTGTGTT GAATACATCT 900

AAACGCCATT GTTCTTTTGC ATACCATGCT ATGACTCTTG CCTCTATTGC AGAAAAATCA 960

CTTACTGCTA GTTCATTACC TTCTTCAGCA GTAAATGTCG TCCTAACTAA TTGACTTAAT 1020

AGATCTCTTG CTATTTCTAA TTCAGTATCT GAAATATAAT GCTTTGTTAA ATTCTGAAGT 1140

TGTACACCTC TACCTGCCCA TCTTCCAGTA CCGGCACCGT AAAATTGAAA CAGACCTCTT 1200 

ACCCGTTCAT CACTGCACAT C 1221 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1090 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TTTTGTTTGG TATGAGGTAG CAATGACGAC GTGTCATTGG TGGAGATTGT AAAAATACAT 60 

AATAAAAAGA AGCGGCAATG TATACCGCTC CTTTTTTATA CTACATACCG ATTTTCAACC 120

ATCTCTTTCT ACTTAGTAAT AAGACAATAG TATTAACTAT AAATAGAAGA ACGAAGAATG 180 

ATACTATATT TATAATTTCA GTAGGACACA TAAATGTTGA CTCGTTATTC AATATTTTTT 240

CTACGGCACG ATACATCGTA TTGCTCGCCT CAAATGGAGC AACGATACCA AATATATTTT 300

TATTAATGGC AACTAAGATG ACTGAACCAA TCCAATATAC AATGCTGATA CCTAAGCTGA 360

TTAAAATGTT AGGTGAAACC ATACTAATCG TTCCAACAAC TAAGATATAT TGTAAGATAA 420

CGAGTGAAAA TAAGATTATT AATAGTAAGT AATGTGAGAA ATCCGAATAT ATAATTGAAA 480

TAATAGTGAT ACTTAGAATT ATGAACACTA AACATTCAAA AAATAACACT GCTACCTTTT 540

TATAGAAGAA GGTAAAGATA TTATCGCCAA TCAATTTATA AAACAGGATA TTTTTATTCG 600

AATACTCTTT ATTAATAAAA TATGCAATAA CAAATGAAAA TAGTAAGAAC CCTAATTGCG 660 

TTGCAACAGT ATATGAACTG AAGAAAAACT GGCTATAGCT TAAACTTTTA ACTTTGTCTA 720 

TACCTATTGG TAAAAAATAC CCAAGTAAGA AAAGGAATGT GAATAGCACA ACAAGCGTGT 780

AAATAATTTT ATTGGAAATA CTTTTTTTAA ATTCTAATTT CAAAGTGGAC ACCTCAATTA 840 

TAAATTAATG TAATCATTTA TGACTTCTTC TTTTGATTGG TACTCTTCTA TTTGAAGGTC 900 

TTTAAAAATA AAGTATTTAC CCGGCAAAGC ACTTAAATCG GATAAATTaT GTGTAATATT 960 

GATAATAGTT TTAGTTTGAT GGCTTTGAAT AAAATCATTT AAAAATTCAT AAATTTCATT 1020 

AACTGTTTTC TTGTCTAAAG CGTTTGTAAC TTCATCTAAT ATGATTAAAT CATGATCTTC 1080

CAATAAGAAA 1090

(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 904 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

TTAGGACTAT TTTATCATAT TCATTTAAAT TACGGCTAAA AATTTTAAAA ACGGGGATTA 60

ATATATGGAA TTAAGCTATG AAAGTTAATT GATACTTGCA TTTTACGCTG ATTTATATAA 120

GAATAACTAT TGTATAGTTT TAAAAACGAA CGTACGTTTG CAGGAGGCGA AATCATTGGC 180 

AATGAATAAA CAAAATAATT ATTCAGATGA TTCAATACAG GTTTTAGAGG GGTTAGAAGC 240

AGTTCGTAAA AGACCTGGTA TGTATATTGG ATCAACTGAT AAACGGGGAT TACATCATCT 300

AGTATATGAA ATTGTCGATA ACTCCGTCGA TGAAGTATTG AATGGTTACG GTAACGAAAT 360

AGATGTAACA ATTAATAAAG ATGGTAGTAT TTCTATAGAA GATAATGGAC GTGGTATGCC 420

AACAGGTATA CATAAATCAG GTAAACCGAC AGTCGAAGTT ATCTTTACTG TTTTACATGC 480

AGGAGGTAAA TTTGGACAAG GCGGCTATAA AACTTCAGGT GGTCTTCACG GTGTTGGTGC 540

TTCAGTTGTA AATGCATTGA GTGAATGGCT TGAAGTTGAA ATCCATCGAG ATGGTAATAT 600

ATATCATCAA AGTTTTAAAA ACGGTGGTTC GCCATCTTCT GGTTTAGTGA AAAAAGGTAA 660

AACTAAGAAA ACAGGTACCA AAGTAACATT TAAACCTGAT GACACAATTT TTAAAGCATC 720

TACATCATTT AATTTTGATG TTTTAAGTGA ACGACTACAA GAGTCTGCGT TCTTATTGAA 780

AAATTTAAAA ATAACGCTTA ATGATTTACG CnwGGgTAAA GAGCGTCAAG AGCATTACCA 840

TTATGAAGAA GGGAtCaAAG rGTTgTTAGT atGTCCAaTG ArGGAAAAGA AGTTTTGCCT 900 

GACG 904 
(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11271 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GATTTCTAAA TCAAGATCTG TTTTACGATA ACCATTCAAA CCTTGACGTT CATCTTCTTC 60

AGGTTGATTT TGTTGCTGTG TGTCTTTGTT GTCAGAAGTC GCTACTGTTT TTTTATTATC 120

TGTTTCTTTA GTCATAACAA ACGCCTCCGT TATAAAACGC TATATTTAAT GATATGTGAT 180

TTAATAAGAC GATTCAGCAA GTTTTAAAGT ATTATTTGAC TATGTTGGAT TAGGCATCTA 300

GTCCTATAAT ATCACTGACA TTGTCAAAAT GATGATCTTT TAAGTAACGT GCGATGCCTT 360

TGTTCATTTT CTTAGTTAAA CCTGGGCCTT CAATAACAAG TGATGAATAA ATTTGAATAA 420 

GTGACGCACC GTGACGCATC ATTTTGATTG CATCTTCAGT ACTGAATACG CCGCCTGTAC 480 

CTATAATTAA AAATTCACCA TTTGTTTGCT GATAAgCATa CTTAATCAAT TTTAAATTAC 540

GTTCAAATAA TGGACGACCA CTCAAACCGC CTTCTTCGAC TTTATTAGCA GAAGTTAAAC 600 

CATCTCGTTG TCGCGTTGTG TTTGCTAAGA TGATACCGTC AAATGTCTCA GTAATCGCTG 660 

GTAATAGTGC TTTTAAGCCA TCGAAATCCA TATCAGACGT TAGTTTTAAA TAAATTGGCA 720 

CTGTTACATC ATGTTGTTTT TTAAATGCTG TTAAAGCTTG GCATAACATT GAAAATTCAT 780

CTTTATCATG GAAGTTTTGA AGATTTTCAG TATTTGGAGA ACTGATGTTG ACTGTGAAAA 840

ATGAAACGTC GTGTTTAAAC GTATCAATAA CCTTTATATA ATCTTGATAA CGCGCTTCAT 900

AAGGTGTCAT TTTATTCACA CCAACATTGA TACCAACAGG TACTTGATAA GCATTTTTAC 960

GCAAATGACT TAGTGCTTTG TTCATACCAA TATTATTGAA GCCCATTCGA TTTATCAAGG 1020

CGTCATCTTC TAATAATCTA AACATGCGTG GTTGAGGGTT ACCCGGTTGA GGTTTAGGTG 1080 

TGATACCACC TAATTCTAAA GCACCGAATC CAAGGTGTTC CAATGCTTTT GGTACTTCGC 1140

AAGATTTGTC GAAACCAGCT GCTAAgCCAA TTGGATTGTC GTACGTATTA CCTTGTATCG 1200

TTTGTGATAA CGTTGGATTC TTATAAGTAA ATAGTTTATC GACGACTGGG AATAAAACCG 1260

GaAACTTTTG TaACGTTTTT AATGCATCGA TAGTTAGTCC GTGTGCTTTT TCGGGTTCGA 1320

TTTTGAATAA GAAAGGTTTA ATTAATTTGT ACATGAGTAT GCTCCTATTT CATTATATTT 1380

GAGGCTTACT ATCCTCAACT TAATATATGT GAAATATATT CTTTTAATAG ACTAGCATTT 1440

CCATACATAA TTTCCTAGTT AAAACTAAAA AGTTTTGAAA ATTGACGCAA gTTTGAATAA 1500

CGTTTTTAAG ATTAAATCAT CCTAATTAGG CAATATTATA GTATAAAGTA AGTAGATTGG 1560

AAGGTGTTTG TATGAATGAA CAATGGTTAG AGCATTTACC TTTAAAAGAT ATTAAAGAGA 1620

TTTCACCAGT GAGTGGTGGT GATGTAAACG AAGCATATCG AGTCGAAACA GATACGGATA 1680

CATTTTTCTT ACTTGTCCAA CGTGGACGTA AAGAATCATT TTATGCTGCA GAAATTGCAG 1740

GTTTAAATGA ATTTGAACGT GCAGGTATCA CGGCACCTAG AGTAATTGCA AGTGGCGAGG 1800

TTAACGGTGA TGCGTATTTA GTGATGACGT ATTTAGAAGA AGGGGCTTCA GGGAGTCAAC 1860

GCCAATTAGG GCAACTCGTA GCTCAATTAC ACAGTCAGCA ACAAGAAGAA GGCAAATTTG 1920

GCTTCTCATT ACCTTATGAA GGTGGCGATA TTTCTTTTGA TAATCATTGG CAAGACGATT 1980

GGCTATGGGA TGCCAACGAT ATCAAAGTAT ATGACAAAGT GCGACGTCAA ATTGTGGCGG 2100

AATTAGAAAA GCATCAAAGT AAACCGTCTT TATTACATGG TGACCTATGG GGTGGTAATT 2160

ATATGTTCTT ACAAGATGGT CGTCCGGCGT TATTTGATCC AGCGCCATTA TATGGTGACA 2220

GAGAATTCGA TATCGGTATT ACAACGGTAT TTGGTGGTTT TACGAGCGAA TTTTATGATG 2280

CGTATAATAA ACATTATCCA CTCGCAAAAG GTGCATCCTA TAGACTTGAA TTTTATCGTT 2340

TATATTTATT GATGGTCCAT TTATTGAAAT TTGGTGAGAT GTACCGTGAT AGTGTTGCGC 2400

ATTCTATGGA TAAGATTTTA CAAGATACAA CAAGTTAGTT AAGACGTTAG ATTGAGATAA 2460

ATAGATAATA TGCACAGATA TTTTTACAAT GAGAAGCGAT ACAGCTGCCT CAATAAAAAT 2520

ATTTGTGCGT TTTTATTGTT GGAAAATAAA ATTTTAATCG CTATTGTTAA TTTCTGTAAT 2580

GTAAAACAAG GTTGAGTTAC AATAAAAGTG ATTTTATAAC TTTTTGTTCA ATAAAATTCT 2640

AGGAATGATA CATATTTATT GATACAATAA TTTTGAATAT AATCATAAAA CAATATTTAA 2700

GTATAATTGA ATGTTTGAAT ATCATATATT GATACAGTTT CTAATAATTT TAAAATAATT 2760

TAAATGGAGA GAGGTGTAAA TGATGAGTAC AGTTCAAAGT GATATTTTTA AGACCAATAG 2820

TGCATCATCA TCTATTAAAA GCGCTGTTGA AACATGTAAT AATGTGTCGA AACCGGATAA 2880

AGATGAAAGT ACAACAGTAA GTGGAAATAA TAATGCTCAT AGTGTGATAG ATGATTTGAT 2940

GAGTAAGAAT CAATCTGTTG CTGAAGCAAT ACGAACTGCG AGCGATAATA TACAAAAAGT 3000

TGGTGAGGCT TTTGACCAAA CTGACGTAAT GATTGGTAAT GAAATTGGTA AAAATTAAAA 3060

CGTGGTGAAA TGATGTCGAA TAAACTGGAT GAAATCAATA AAATAATCAC AGCGAAACAT 3120

GAGCAAATGG ATGACTTATA TGATGAAAAG CGAGAGGTTA AAGCATTGAT AGATGAAAGT 3180

GATGCGCTTA ATCATTCGAT AGATCAATTA TATCAACATT TAGGTGAGCG TTATTATAGT 3240

AGCAATATGG CTAGTCGTAT GGAACAGTTC CGCGATGAAT TTCATTTTGC GAAACGACGT 3300

TCAACGGAAG CGTTATACGA GCAGCAACAG CAAATTCAAC ATGGCATTCG TAAAGTGGAA 3360

GAAGAGATGA TTGACTTGGA AATGCGAAGG AATGTTGAAA TTGAGACGGT GACAAAGGAG 3420

GAAAATAAAT GGAAACAATA GGAAGCATTA TTTATTTAAA AGAAGGTTCG CAAAAGTTAA 3480

TGATTATTAA TAGAGGmCCA aTTGTAGAAA TTGAAAATCA AAAGTATATG TTTGACTATT 3540

CTGCATGTAA ATATCCGATT GGTGTTGTAG AAGATGAAAT TTATTATTTT AACGAGGAAA 3600

ATATAGATTC AGTTATTTTT AAAGGTTATT CTGATCAAGA TGAGGTTAGA TTTCAAGAGT 3660

TGTTTGAAAA TATGAAACAA AATTTGGATA GTGAAATACA ACGTGGAGAA GTTACACAAC 3720

AATAAAGAAA TACTTTTTCT TTATTGGGGT GGGACGACGA AATAAATTTT GTAAAAATAT 378 0 
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ATGTCATTCA TAATCATTTG AACTAAACGT AGCAGCCTTA AATTTTAAAA AAAGACACAT 3900 

ACCAACTTCC GAAATGTAGA TGAATTCTCT ACAATAACGG AAGTTTTTCT TTTAATATTG 3960

AAATTTCTCA AGGATAGGTC TATACTTTAT AAATCGTAAT TATTACGATT TATAATCAAA 4020

AACAATAACT TGAAATAGAT CATTGAGGGA GTGTTAATAT GCAACATCAT AAAGTGGCTA 4080

TTATcGGTGC CGGTGCTGCA GGTATAGGTA TGGCCATTAC CTTAAAAGAT TTCGGTATAA 4140

CAGATGTCAT TATTTTAGAA AAAGGAACAG TAGGACATTC ATTTAAACAT TGGCCGAAAT 4200

CGACCCGTAC GATCACGCCA TCATTTACGT CTAATGGATT TGGCATGCCT GATATGAATG 4260

CAATTTCCAT GGATACTTCA CCAGCATTTA CATTTAATGA AGAACATATT TCCGGAGAAA 4320

CATATGCTGA ATATTTACAA GTGGTTGCCA ACCATTACGA GCTGAATATC TTTGAAAATA 4380

CAGTTGTCAC AAATATATCT GTAGATGATG CATATTATAC GATTGCAACG ACAACAGAGA 4440

TATATCACGC GGATTATATC TTTGTCGCAA CAGGTGATTA TAATTTCCCT AAAAAgCCAT 4500

TTAAATATGG TATTCATTAT AGTGAAATTG AAGACTTTGA TAACTTTAAT AAGGGGCaAT 4560

ATGTGGTTAT CGGAGGTAAT GAAAGTGGCT TTGATGCTGC ATATCAACTT GCAAAAAATG 4620

GCTCTGACAT CGCACTTTAT ACTAGCACAA CCGGTTTAAA TGATCCGGAT GCTGATCCTA 4680

GTGTTAGATT GTCACCTTAT ACACGTCAGC GACTAGGTAA TGTCATTAAG CAAGGTGCTC 4740

GCATCGAAAT GAATGTACAT TATACAGTTA AAGATATTGA TTTTAACAAT GGACAGTATC 4800

ATATCAGTTT TGATAGCGGA CAAAGTGTGC TTACACCTCA TGAACCAATA CTAGCAACTG 4860

GCTTTGATGC AACAAAAAAT CCAATCGTTC AACAATTATT TGTGACAACA AATCAAGATA 4920

TTAAATTAAC AACACATGAT GAATCGACAC GTTATCCGAA TATTTTTATG ATTGGTGCAA 4980

CAGTTGAAAA TGATAATGCC AAATTATGCT ATATCTATAA ATTTAGAGCG CGATTTGCAG 5040 

TACCTGCACA TCTTTTAACA CAGCGGGAAG GcTTACCAGC TAAACAAGAT GTCATTGAAA 5100 

ATTATCAAAA AAATCAAATG TATTTAGATG ATTATTCATG TTGTGAAGTG TCATGCACAT 5160 

GTTAGAAGTG AAATATGATA TGAGAACTGG GCATTATACG CCCATACCTA ATGAACCTCA 5220 

TTATTTGGTT ATTAGTCATG CGGATAAACT TACCGCAACA GAAAAAGCGA AATTAAGATT 5280 

ATTAATCATA AAACAGAAAT TAGATATTTC ATTGGCAGAA AGTGTAGTTT CTTcGCCTAT 5340

AGCGAGTGAA CATGTGATAG AACAATTGAC ACTATTTCAA CATGAGCGAC GACATTTAAG 5400

ACCTAAAATA AGTGCGACAT TTTTAGCCTG GTTGTTGATA TTTTTAATGT TTGCATTGCC 5460

AATCGGTATC GCTTATCAAT TTTCAGATTG GTTTCAAAAT CAGTATGTGT CAGCATGGAT 5520 

AGAATATTTA ACTCAAACAA CATTGCTCAA TCACGATATA TTACAGCATA TATTATTTGG 5580 
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ATTGATTAGT TTATCAACTG CTATAATTGA TCAAACAGGA CTCAAATCAT GGATGATATG 570 0 

GGCAATTGAA CCGTCAATGT TATGGATAGG ATTACAAGGT AATGATATCG TGCCACTATT 5760

AGAAGGGTTT GGATGTAATG CAGCAGCTAT TTCACAAGCA GCACACCAAT GCCATACCTG 5820

CACGAAGACA CAGTGTATGA GTTTAATAAG CTTTGGTAGT TCTTGTAGTT ATCAAATAGG 5880

TGCGACATTA TCTATTTTTA GTGTAGCTGG AAAGTCATGG CTATTTATGC CGTACTTAAT 5940

ATTAGTACTT TTAGGTGGCA TCTTACATAA AGGATATGGT TGAAAAAGAA TGATCAACAA 6000

CTTAGCGTTC CGCTACCTTA TGATAGGCAA TTACATATGC CAAATATACG TCAAATGTTG 6060

CTACAAATGT GGCAAAATAT ACAAATGTTT ATCGTTCAAG CGCTACCTAT TTTTATCACA 6120

ATCTGTCTTA TTGTTAGTAT TTTATCACTA ACGCCAATTT TGAATGTTTT ATCACAAATA 6180

TTTACACCTA TATTATCGTT ATTAGGCATC TCGTCAGAAT TGTCACCAGG GATTTTATTT 6240

TCAATGATTC GAAAAGACGG CATGCTCTTG TTTAATTTGC ATCAGGGCGC CTTATTACAA 6300

GGAATGACAG CAACACAGTT ACTACTACTT GTGTTTTTTA GTTCAACATT TACAGCGTGC 6360

TCGGTCACAA TGACGATGCT TTTGAAACAT TTAGGTGGTC AGTCAGCACT AAAATTAATT 6420

GGAAAGCAAA TGGTGACATC ATTGTCTTTA GTTATTGGTG TAGGCATCAT TGTTAAAATA 6480

GTAATGCTGA TTATTTAAAA AAAATGAACT ATAACTGAAT ATAGAGTCAT GTCAGTCAAT 6540

AGGAGATCTA TCTTGGAATA TGCTATTCAT ATGAAGTATA AGAGGAGAGT CGCAGATGAA 6600

AATAGTTATT ATAGGTGGGT TTTTAGGTGG CGGTAAAACG ACTGTCTTAA ATCATTTGCT 6660

CGCTGAATCA TTAAAGGAAT CGCTGAAACC AGCAGTCATC ATGAATGAAT TTGGGAAAAT 6720

GAGTGTTGAT GGTGCCTTAG TATCTGAAGA CATACCTTTA AGTGAACTGA CAGAGGGGTG 6780

TATCTGTTGT GCAATGAAAG CAGATGTATC AGAACAGTTA CATCAATTAT ATTTAAAAGA 6840

GCAACCAGAC ATTGTATTTA TTGAATGTAG TGGGATTGCA GAACCGGTCT CTGTCTTAGA 6900

TGCTTGTTTA ACGCCTATTT TAGCTCCGTT TACAACAATT ACACATATGA TTGGTGTAAT 6960

AGACGCAAGC ATGTATAAAC ACATTAAATC ATTCCCTAAA GACATCCAAG GCTTATTTTA 7020

TGAGCAATTA GCATATTGTT CTGTCTTATT TGTTAATAAA ATAGATTCAG CAGATGTTGA 7080

AACAACGAGC AAACTATTGA AAGATTTAGA AGTTATTAAC CCAGAGGCCG ATATACAAGT 7140

CGGTATGCAT GGCAGCGTCA CTTTGCCAAT ATCAGTTAGA CAAATGACAG CAACTTCTGA 7200

CAATAAACAT AAGTCTTTAC ATCAAATGAT TAATCATCAA TTTGTGCAAT CACCAGTCAA 7260

ATGTACTAAA GCAGAGTTTA TAAAACGTTT AGCATGCCTT CCGTCTCATA TTTATAGGTT 7320

GAAAGGGTTT ATGACATTTG AAGACACCGC ACATACGTAT CTCATTCAAT TTACACAAGG 7380
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CGGAAAGGGT 
GAGAATGGTT 
TAAAAATAAG 
GAGAACAATC 
AATATAACCT 
ATAAAAGTTA 
TTTATCGCTG 
TTCAAATTCT 
AGCAGGTTGA 
TTTCATGCCT 

TTTAACGACT 
ACCTAATATA 
GTCGCGTTTG 
TAACTCGCTT 
CTGTTCTGCA 
CACGGGTCCT 
CATATTCAAT 
TTGTATAACT 
CTAATTTTTC 
CTTCTTTCTT 
AATTATTTTT 
CCATACGTTC 
TTGGATAACG 
TGTGCCAACC 
TATAGTATGA 
CTAAATTTTC 
GTGCACGTCC 
CTCGCGCTGC 



ATTTCAAAAG 
AACATGCCTT 
CTTGGTCAGC 
AATTAACCCC 
AAGTGACCGC 
ACATCTTGTG 
TATTTTGTGA 
GACCAGAACA 
G ACAT CATTT 
TTTTCTAAGC 

TCAGGGTAAT 

TAGACAGGTT 

ACACGATAAT 

TCTCCATAAT 

AGAGGCAGAA 

TGTCCGACTT 

GACCTCCATT 

TATTTTCTCT 

AGGCTCAATA 

GACTAAATCA 

AAAGTAATAG 

TAAGAAGAAT 

ATCAAAAATA 

ATAACCAAAA 

TTGATAAATG 

AGCTGTTTTG 

CATAATGAGC 

GGCTTCAGGC 



AAGACTATCA 
CATGTATAAT 
CATCAAATAT 
ACATATTTAA 
CTGGAATATC 
GGAAGGAGCC 
AATCATCCAA 
TCGTACGTTT 
TTGCATCAAT 

CTTCTGTTAA 
r*TTi a a n aTr.T 



CTTTTAACAC 

CATCACTTAA 

CACTGTCAGG 

CACGACGATC 

AAATGTCTCC 

GGTGGTATCG 

TGTTAATTGT 

TTTTCTTCAT 

TATGGATAAT 

AACTGTGGCT 

CTTACAGGGT 

GGGATAAACT 

CCAGATAATA 

CAAGCAAATG 

TCACTGTTAA 

AAAATAATGT 

GCACCTTTGA 

TCATTGATAG 



ATGTTTGGAA 
AACGAGTTGA 
AATTTGAAAA 
TAATACATCA 
TACAATAGGT 
TCTAGAATCT 
AGTAATATCT 
GTACTGTTCT 
TGGTGCGATA 
AATTTGATGC 
ATTAATTGGT 
ATGCATCGCA 
TGACTTTGCA 
GTTTGAAGCG 
AACGGCTACA 
GGTACCGTTT 
TAATTTAGCG 
TAGGTGATAA 
CTGTTAAACC 
CAGCAGCATA 
TCGTTAACAT 
GGTTCAAATG 
CACCCCAATG 
CTAGATGTAT 
TTGCCGCAGT 
CTGGCGCGGG 
CATATTTGTC 
ATCCTAAATC 
GTAAAGTTGC 



CAGTAGTGTT 
TTTGAACGTT 
CTGTCCAAGC 
GCAAAGCCTT 
ATGCCAGTTT 
GTCCCATTTA 
GAATGCGTAT 
ATACGTCCTT 
TTTAATGTTT 
ACAATGTCAT 
GGTTCGTGAA 
ACGATTGAAC 
AGTTCGGCAA 
GAATCAGGGA 
ACAGTAAAAT 
GCACCAGGAA 
CCTTGTAATT 
ACCTAATAAT 
CAGTTCATCT 
AAGAATTCTA 
GCCACTCGGT 
TTCAGCGAAT 
TCCAATAATC 
TGTATGAATG 
TACTTCAGGA 
ATGTAGATAA 
TTGATCAAGA 
ATTGATGCAA 
AAAGC CTACA 



TTCAGTGGAA 
TAAGCGTAAA 
TGTTTTATTA 
CAGGTTTTTG 
CTTTATTTAT 
GTAGGGTGAT 
ATTGTCTAAT 
CTTCAGTATC 
CGCCAAATGT 
CATTTTTATC 
A.TGCA ATCTT 
CTGAACTTGA 
TGTCCTGTGC 
GTGGTTCAGT 
GGTCTTTTAA 
TAAAGATGAG 
CTAAAGTTTC 
TTAGCAC CAT 
AAAAATACAC 
TCAATACCTA 
GTGAT ATAAA 
AAAGCTTCAT 
ATATTTAACT 
CCGACATCAA 
TAATTTCCTT 
ATCGGTACGT 
AAACCATCTT 
CGTTCGAATT 
AAGCGATTGG 



7500 

7560 

7620 

7680 

7740 

7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580 

8640 

8700 

8760 

8820 

8880 

8940 

9000 

9060 

9120 

9180 
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1 V. IVlnWwiM^ 


'M-Hnnl 1 IvjAA 


PP A P A A PP A T 

GGAGAACCAT 


TTCGATAAGA 


TAAGACTTGA 


ATTTGAACGT 


9300 






V HH Ali"W\Al 1 ob 


A1ALGTTCAT 


CATGATGTGA 


TAATTCGTCG 


GCATTTGTAA 


9360 


5 




11111 Cnnvju 


CC 1 1 C IAACA 


TTACTTTCAT 


CGGTACACCT 


TTAGGATCTG 


9420 






All V— A 1 Lb 1 1 


1 L 1 1 1 1 ILrAA 


TAT CTTCAAT 


GACATAATGT 


TCTTCAAACG 


9480 


10 


1 AA 1 AL X 1 1 1 


& ^^^r* i ^p ^ ^^^^^^^^ 

LA ill Ac i rc 


f t "PPP A T A T 

G CCTCCATAT 


TGTATTGCAT 


GTTTATTGCA 


TCTATTGCAG 


954.0 




nnbun i 1111 


1 A 1 A 1 ALL 1 L 


T A A TTTPH A T 

1 AA1 1 1CAAT 


GTTTGTAACA 


TAAAATTGAT 


CTACCAAGGC 


9600 




AIL. 1 v_ 1 Lv-ftl 


v-oUUil 1AA1 


A A A TV^TTi PPT 

AAATGTACCT 


G 1 1GGGC CAT 


CTGCACCAAT 


CGTTG CTAAT 


9660 


15 




LjC 1 1 VjAI XlC 


TTCAGAAACG 


TGTTTGGAAT 


TATTACTAAA 


ATCACCAACT 


9720 




A A A TP TN ^ T A T 

AAATCAGTAT 


1 1GTAGCGCC 


TGGAT CAG CA 


GCATTGATTT 


GCATGTTAGG 


TAATCCTTTA 


9780 




gcuiai ro*T.A 


G LGTT AG CAT 


TGTTACTGCC 


GATTTAGACG 


AACAATAAGC 


TAATGAATTC 


9840 


20 


A CTTT AG ATT 


CAGCTGTTTC 


GGGGTTTGTA 


AC CATTCCAA 


ATGAACCTAA 


ACCACTTGAT 


9900 




ACGTTGACGA 


CAACAGGTTG 


TTCAGATTTT 


TCTAAGAGAG 


GGACGAATGT 


ATTCATCATT 


9960 


25 


PPT A ^/""» A T A P 

CGTACGATAC 


/■»/""» » * »T* ft /"* A WWII 

CGAATACATT 


CGT7JTGATAT 


ACTTCTTCAA 


CGTCACGAGG 


TGTCAATTTG 


10020 


GAAGGTG LTG 


A A A A • ! ** IV ' It PP 

AAAATTGACC 


AGATATACCT 


GCATTGTTAA 


TGAGGATATC 


AAGACGGCCT 


10080 




TCTTTTTCAG 

TPT A P A T A AT* 

1 I ACATAAT 


CAATCATGTT 

P A A P A PPT J\ *\ 

GAAlACCTAA 


ATAAGCATTT 


TTGACTGAGT 


AGT CACTTGT 


AACAT CTAAT 


10140 


30 


Tl~l~l~l~GTGAT 


GC 1 IXiX"i"GTC 


CTCTTACATC 


ATTCCGAGAA 


10200 




LLIA1A1AAA 


CTTTGTAACC 


CAATGCTTTA 


AGTGCCTCTG 


CACTTGCATA 


GCCTAACCCT 


10260 




1 I A 11 (jL LlL 


LTGTGATTAA 


PA ^ A A T**TT*»T» '•K 

CACAAl 1 1 1 A 


GTCATTACGT 


CCCACCTCAT 


CTAAATAAAT 


10320 


35 


rST*fT& A.T A » » 

VjI 1 1AAXAAA 


TAATTTCTGT 


ACGC X i GAAT 


TGAAATATGG 


CGATGCTCTA 


TTTGGAAGGC 


10380 




"iiriTTirErTri 

AAAlnLAt X A 


f "1* , I*P A T A A TV" 

Vjr TTGATAATG 


ATTGCAACAG 


CATATCTGTT 


TTGAAtTCGT 


GTAAGTGTCG 


10440 




1LA1«,GL1 1 1 


TftRTiTMk PTP 

1 AAATAAGTC 


ATA ATA A A A A 

ataataaaaa 


^^^^ K n n fmt ****** 

TCAAATAATT 


CTTGATAAAA 


TGCGCTTTGG 


10500 


40 


iAnnnACG 1 A 


ATTTATTGTT 


GCCTGCTTCA 


ATACATTG CA 


GTAGTGCCTT 


ATTATCGATT 


10560 




1 1AAA1 IGiA 


AAAlxAl AA1L 


T A A PP 7\ /-»7\ /*»T 

TAAGGACACX 


P A A n P / «T< 

TG CATAAC CT 


CATAATTAGA 


ft #Y*)^-* ft *WH ft ^ 

ATGATAGTCA 


10620 


45 


1L1 1 1AA1 1 i 


vjv- 1 i AAAAlVj 


7\ / A T* A AAA 

AGTGATAAAA 


ATAT^/^A APPT 

A 1 ATCAAGG T 


CTCTTTGTAT 


GACGTAGTAG 


10680 




CA 1 AAA i C vj C 


TTTTATCTTT 


P AAA ill/ ' 1 1 »/^/^ f\ 

GAAATGTCGA 


HPAPA K1V"T«/1/* 

TACAATGTCC 


CCATACCGAT 


ACCTAGTTCT 


10740 




TTAGCAATAC 


GATTCATACT 


AATGTTTTCA 


acgccttctt 


CATCAAAAAG 


TTTGTGCGCT 


10800 


SO 


ATTTCTTCAA 


TTCGTTGCCT 


ATTCTCTTTT 


GCATcrrrrc 


GCATGATTAC 


ACCTACTTAA 


10860 




AATTCTCTAA 


aATTGACAAA 


CGGATAACTC 


T C CGTTT ATT 


ATAAAACGTG 


TTAAGAAAGT 


10920 




TAGCAATGAA 


TTTGCAATAA 


CTATTAAATA 


TCATAAAAGA 


AAAGAGTGTT 


GATAATGTCT 


10980 
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ACCTTATCGG TTCAAATGAT TGCTGAAAAA CTGAATGTCA CTACAGAAGA TGTGGAAAAA 11X00 

GTATTAGCTA TGACAGCGCC ACTAGGCATT TTTAGTCATC AATTACAACG ATTTATTCAT 11160 

TTAGTATGGG ATGTCAGAGA TGTAATAAAC GACAATATTA AAGGAAATGG ACAAACACCA 11220 

GAACCATATA CGTATTTAAA AGGTGAAAAA GAGGACTATT GGTTTTTAAG A 11271 
(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6261 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

CAACCCGTTC AGAACAAAAT AAAAACCGTA CAATTTTATC ATCTTAATGA TTATTGTACG 60

GAAAAACTTT TTTACATCAT ATCTGCATGT GCATAATCGA TATCGGTAAA TTTATTATAT 120

TGTTTCATAA AATGTAACTT AACTGTGCCT GTTGGACCGT TACGTTGCTT AGCAATGATA 180

ATTTCAATTT CACCGTTTTC ATCATTCGTT TGTGGCTCGA AACCACCATC ATCGTCATCA 240

TCTTCATCGC CGCCACGGTT ATAGTAATCA TCACGGTATA AGAATGCAAC GATATCGGCA 300

TCTTGCTCAA TCGAACCAGA TTCACGAATA TCACTCATCA TTGGACGTTT ATCTTGTCGT 360

TGTTCAACAC CACGAGATAA CTGACTTAAT GCGATAACTG GACATTTTAA TTCACGGGCT 420

AATGCTTTTA ATGTACGAGA GATTTCAGAA ACTTCCTGTT GTCTGTTATC GGACGCACGT 480

GAACCACTAC CTTGAATCAA CTGTAAGTAG TCAATCACAA TCATGTCTAA GCCATGTTCT 540

TGCTTTAATC GACGACATTT AGAACGTAAA TCATTAATTC GAATACCCGG TGTATCATCA 600

ATAAAAATCT TCGTACGTGA TAATTTACCT ACCGCTATAG TAAAACGACT CCAATCTTCC 660

TCAGTCATAG TACCCGTTCT TAAGCGGTTT GAGTCAACAT TTCCAGAACT ACAAATCATA 720

CGTGTGGCTA ACTGATCAGC ACCCATCTCT AGCGAGAAAA TACCAACTGT ATACATATCT 780

TCATGCGTTG CAACTTTTTG TGCAATATTA AGTGCGAACG CAGTCTTACC TACAGATGGA 840

CGCGCTGCAA GGATAATTAA ATCATTTCGG TTGAACCCTG CTGTCATTTG GTCTAAATCT 900

CGATATCCTG TAGGTATACC TGGTGTTTGA CCACTATTTT GATCAAGCTC TTCAGCTGTT 960

TCATACACTT GTCCTAAGAC GTCTCGAATG TCTTTAAAGC CATCGCTTTC ACGAGAAGAT 1020

GATAGCTCTA AAATTCGACG TTCTGCATCA CTTAAAATCG CATCTAGTTC AAGTTCATCA 1080

TTATATCCAT CATTGGCAAT ACTATCTGCA GTTTGAATCA ATCTACGTTT TAATGCATGC 1140
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TCTGCAAGAT ATTGCGGGCC ACCCGCTTcA TTCAACGTAC CTTCCGTCGA TAATTGATCC 12 60 

ATCAATGTTA CAACATCAAT TTCTTTATTA TCTTCATTTA AGTGCATCAT TGCACGGAAA 1320

ATATGTTGAT GGGCACCCCT ATAAAACGAC TCAGGAAGCA AAACTTCCTG AGTAGTATTA 1380

ATCAATTCTG GATCTATAAT AATTGAACCT AAGACAGACT GTTCAGCTTC ATTGTTATGC 1440

GGCATTTGAT TTTGCTCATA CATTCTATCC ATGAATGGTT ACACCTCTTA TTTCAATCCA 1500

ACTTTATTGT TCAACTGTGT GTACGCGAAT TGTACCTTCA ACTTCTTTAT CTAATTTAAC 1560

AGGTACATTC GTATATCCTA GGGAATGAAT TCCATTTGGT AAATCCATTT TACGTTTATC 1620

AATTTTAATA TCATGTTGTG CTTTTAGTGC TTCGGCAATT TGTTTTGTAC TTACTGACCC 1680

AAACAATTTA CCACCTTCAC CAGTTTTTGC TGaTACTTCA ACTTCAATGT TTGATAACGT 1740

TTCTTTTAAT GCTTTAgCAT CTTCAATTTC TTGTTGGCGT TCTTGTTTTG CACGTTTTTT 1800

CTGTAACTCT AATTGTTTAA GGTTACCTGG TGTTGCTTCT ACAGCATAAT TCTTTTTCAA 1860

TAAGAAGTTA TTTGCATAAC CTACTGGTAC TTCTTTAACT TCACCTTTTT TACCTTTACC 1920

TTTACCTTTA ACATCTTGTG TAAAAATTAC TTTCATGCAT CTTCACTCCT ACTTAATTGT 1980

TCTGTAATTG CTTGTTGTAA TTGTGCTATC GCCTCTTCGA CTGTCACACC TTTAAGTTGT 2040

GTTGCCGCAT TGGTTAAATG TCCACCGCCA CCAAGTGCTT CCATTGTTAA CTGGACATTT 2100

ACTGAACCGA GTGAACGCGC AGATATACCA ATCAGATTAT CTTCACGTCT CGCAACAACA 2160

TATGATGCTT CAATACCTTC TAAACTTAAC AGTTCATCTG CTGCTTGTGC AACTGTTACT 2220

GGATGATAAA TTTTATCGTC TGAACCATGC GcAATGGCTA TGCCATTATC TTCAACTTTT 2280

ACAGTTCGAA TTAATTCAGA TCGATTAATG TAAGTATCCA CATCATCTTT TAAGAAATGT 2340

TGCGTTAAAA TCGTATCTGC ACCATGTGCA CGTAAATAAC TCGCTGCATC GAATGTTCTT 2400

GATCCTGTTC GTAATGTAAA GTTTCTTGTA TCTACAATAA TACCTGCATA CATCACTGTT 2460

GATTCAAGAC GTGTTAAACG TTGTTCTGTT GGTTGATATT CCAGTAACTC TGTTACCAAT 2520

TCAGCTGTCG AACTTGCGTA TGGTTCCATA TATATCAACA ATGGATTAGA GATGAAGCTT 2580

TCACCACGTC TATGATGATC GATAACAACT TTACGGTTTG CTTTATTTAA GACATTTTCA 2640

TCTAAAACCA GTTCCGGTTT ATGCGTATCA ACAATCACTA CGGTTGTCTT AGATGTCATC 2700

ATATCCCAAG CATCATCTGA TGTAATAAAT CGCTCTCTTA ACTCTGGCTT TTTATCTATT 2760

TCGTTCATCA CGCGTCGTAA TGTTGGATCA ATGTCAGTCT CATTTAATAC GATGTATGCT 2820

TCTAAATTAT TCATCATTGC AAATCTAGAC ACACCGATTG CTGCACCAAT TGCATCTAAG 2880

TCAGGACGTT TATGTCCCAT GATAATGACT TTGTCACCCT CTGCAAGGAT ATCTTTTAAC 2940
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CCATAGAAAC GCACATTACC ATTAATACTT TTAATTGCAA CTTGGTCGCC ACCGCGTCCT 3060 

AATGCTAAGT CTAGGCCTGA TTGTGATAAT TCACCTAAGT CGATTAAATT TTCAGTACCT 3120 

TCACCAACAC CGATACTTAA TGTTAATTGG GCACGATAAC CAACACTTTT TTCACGTAAT 3180 

TGACTCAAGA TATCAAATTT AGATTCTTCT AAGTCAGCTA ATATTTTTTG ATTTAAATAG 3240

GCTACGAATT GATCGGAACT GTATCTTTTG AAAAATATAT TATACTCAGT TGCCCATCGA 3300

CTAATGACAC GCGTTACCAT TGAGTTGATT TCCGAACGCT GCGTATCATT CATATTTTGC 3360

GTAATCTCAT CGTAGTTATC TAAAAATAAT GTCGCAATGA TTGGTTTAGA ATTTTCATAT 3420

AGTTCATTTG TTTGTACTTG TTCAGTTATA TCAAAGAAAT AGAGGCAGTG ATCATTCTCA 3480

GAATAACGTA CTTGGAAATG ATACTGATTA TATTCTATTT cAACGGATTT CACTCTATCT 3540

AATTGCTTTA AAATGTTTGG AAATACTTCA TTTACAGATT CAGAAATGAC ATTCGCTTCC 3600

ATATGATCTG TCATAAATTG GTTAACCCAT TCGATGTGAT CATTTTCATC TAAAACAATG 3660

ATACCAATTG GTAAATGTTT GATTGCTTTA TTATTTGTTG TTGAAATTTG AGCACTCAAA 3720

CCATCTACAT AACTATCCAT TTTCATTAAA GCTTGTCTGA ATAAAATGAT GCTAACAATA 3780

ATCATCACGA CAAGAACGAT AGATGCAATT AGTGCTATAA GACTATTAAA GATAAACCAT 3840

ACACCCATTA AAACAATTGC TGTGATGATC ATGATGACAA ATGGTATTAG TAAAGCTTTC 3900

TTAGTGGACT GCCGATTCAT TATTCCACCT CTATTCACTT TTTAGAATTA TTTTTCATGA 3960

TTCGCTTCAA ATTCAAACTT AAATCGATAA CACCAAGTAG TCCTACAATA TGTGTCGTAG 4020

GTGTCAGTAT TGTACCGATA ACCAATAGTA AAATCGTTAC TGCATTCGGC AAACCTTTCG 4080

CTTTACCAAA GAAATGAATA ACACTTAAAC CTTGAATATA CATTACTAAT GATAACACAA 4140

GTTGGAAGTT TAAAAGAATG CTCTGGAACA CACTCGGTTG ACCTGTAAAT AATAAACATA 4200

TGATAACAAT AATGTATATC CATAATAAAA TACCGCTCAT TTGCCACGCG AAAAGTGGCT 4260

TAAATACAGG TGTAGCGATT TTAAATTTTC GTAAAATCGG AAATGTAACG ATTAAGTTAA 4320

TTAAGACGAT TAAAAATGTA ATGATAATGA TGAAACCTGG TAATTGAACG GTCGCTTGTC 4380

TAAACCCTTC TTCTAATATT TGGGTCATAT TCGCATCGGC ACCGCTCATC GTAATCGCTT 4440

CATGTAATGT TTGCTTGAAA GGTTTTACTA TGCTCGCTGA TGGTGGAATC CTTCCGAATG 4500

TTTGTAGTAA CATAAAAGCG ATTAATGAAA TTnArCTCAT CGCTACTGTT GTTACGTATA 4560

ATATTCTTTC TTTAGACGTT CTTTCTTTGA GCAATTGACC AATAATTAAA CTTGCAATTA 4620

AGACTAATAT GATGGCACTT AAAACGAAAG TATTACCTAA AACAGTTGTT ATAATTACTG 4680

TAATAAGTGC ACTAATCCCG AAAGATTGTA TTGATTTATT CCATAAAACG ATACCTGGTA 4740
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CAAATACCAA 


CGCAATCGTT 


GCAATTATTG 


TTGCTTTAGG 


TTGTATrrrr 


GAAAACACAT 


4860 




AAGCCACTCC CATATTTTTA ACTATAGCTA TTATTTTAAC CTCTTTAATG AAAATTAACA 4920

ATTTATAGAT TGTATGCTTC TATTTCATTT AATTGAATAA TAACTTTCAT GTTTTATAAG 4980

TAATTAACAT ACTCATTTGA ATCGCTTTTG TGTGCTTTCA TTTTCAACAT GATTATTTAA 5040

TCCCACTACA TAGCAATCAA GCTTGATTTA GATTTACAAT ACATTTCCAC TCTCATGTAC 5100

TCTAGATGTT TTTGAATATG ATAACTGTGA TTTAGTGGCT TCATTCTTTG AAAATATATA 5160

TTATTACTTA CGCTTAAAAT GCTTTAAATT TAAGAAATGA TATAAGTTAG GTGCCCAGGT 5220

ACTAAAGTTT AGTAGGaATC CATCATGCCC AACATTATCA GGCACGAAGA AATGACGATG 5280

ATATTTAAAA CGTTCACCTA ATGCACGAAC TTGATCATCC GGATATAGCA AATCATCTAT 5340

GAACCCCATC GTTAACACTT TTGTTTCTAA ATTTTTAAAA ACATGCGTTA CGTCTGTGCG 5400

ACCTCGGTCA ATGTTGTGAC TATCCAATAC ATCTAGCAGT GTCAGATAAC AATTCAAATC 5460

AAAATGTTCT TTAAATTTAT TACCTTGATG TTGTTGGTAT GCGACTACTT CATCCGGCGT 5520

AAAACGTTCA TCATAACTTT TTGATGATCG ATATGTCAAA AAACCTAATT GGCGTGCAAT 5580

ACTTAGACCT TCCTTACCAC CAAGATGAAT GGCTTGCCTT GCAATTTCAT TGAAAGCTCT 5640

ACTATAAGAT GATGTTCGAC TTGTTGCAGC AAGGATAATG GCTTTATCTA CTTCAAACTG 5700

TTGATTGTAG AGTAGTTCCA TTGCTTGCAT ACCTCCAAGA CTTCCCCCTA TTAAAATATT 5760

AATCTTATCA TAACCAAGGG CTTGTATACC TCGTTCATTC GCTCTGACTA TATCTCTTAA 5820

TGTTAATTTT TTAGGAAAAT GAGGGTCGTT TAAAGGTGAA CTTGAACCGA AAGGACTACC 5880

AATAACATCA AATGTTAAAA ATTGATAATC GTGAATGGGT ATATATCCCC CATCAATAAT 5940

TTCTCGCCAC CAACCCGGAT AATCATCTGT TCCATATGTT AAATGATTGC CAGTTAATGC 6000

ATGACAAACT ACAACTAATG GTTGTCCATG ATAACCGACA TGCTCATATC TCAAACGCAA 6060

GTnATCTATG ACTTCCCCAG ATTCTGTAAT AAATTCCCCT AAATTTAAAG TATCTACTGT 6120

GTAATTTGTC ATTGTTCTTT CCTCCTTAAA CAAAAAAACT TCTCACCCTA TTGAAAAGTA 6180

AGAAGTCTTT ATACTTATCA TTCGAGTAAC TCGTTGGTTT TAGCACCGTG CTATAAAGTC 6240

GGTTGCTGAA GTATCACAGG G 6261



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1222 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 



ATGCGATTAA CTCTGGAAAT ATCTTTTCCA TATTTACGTn TTAAATTATT CAGCAAATTC 60

ATACGAGaTT CATACTCGTT yAACACTTGT TCGTCGAATT CTGTATTAGC CATTTCATCA 120

TATAACTCAT GTTTTGCATC TTCTAAAATG TAGTAAAATT GATCAATATC TTCTTTTAAT 180

TTGTCATATT TGTTTGGAAC TATATCGTTT ATTGTTAACA AATGGTTGCT TAGTTCATAT 240

AAACGATCAG TGATAGCATT TTCATCCGTT AATGTCATAT ATGCGTTATT AAGCGCTAAG 300

CTTAATTTTT CAGAGTTTTG AATGCGTTTA ATATCTATTT CAAGTTGCTC TATTTCGCCT 360

TCTTTTAGAT GTGCTTCAGA CAATTCTTCT AATTGGAATT TCATTAAATC TAAACGCTGT 420

AGCAATGCTT GGTCTGCTGA TTCTAAATCT TCTAACTCTT GCTTTTTGGC TTTATAATTT 480

TGAAAAGTTT GGTGATATTT ATCCAACAAA TCTTGATAAC GTGATTCTGC GTAATTATCC 540

[illegible] AATGGTATTT TTGTTTCAAC AAAGACTGCG TTTCATGTTG GCCATGAATA 600

TCTAATAATT CTTGCATAAC TTTTCGTAAA TCTTGTAAAG TAACTGTTTG ATTATTAATT 660

TTACAAAGAC TTTTACCAGA GCTGAAAATT TCCCGTTTAA CTAATAAAAA ATCTTCATCT 720

ACATCAATAT CCATATTTTf CAATATATGT ATAGCATCTT TACTCTCGTC AATATCAAAT 780

ATACCTTCGA TGACAGCCTT TTTTTCACCA TGTCTTACAA AATCAGATGA AGCTCTCATT 840

CCAATTAATT GTCCAATTGC ATCTATAATA ATTGACTTAC CTGAACCCGT TTCACCACTT 900

AAAACAGTTA AACCATCAGA AAATTGAATT TCTAATTCTT CAATAATAGC AAATTGCTTG 960

ATTGATAAGG TTTGTAACAT AAACTCATCG CATCCTTATA ACAAATTGAA AATTCTTGAC 1020

TTGATTTCAT CACTTGCCTC TTTGCTTCGA CAAATAATTA AACAAGTATC ATCACCACAA 1080

ATTGTGCCTA GTACTTCTTC CCAATTGATT TGGTCTAATA TAGCTCCAAT AGATTGTGCA 1140

TTACeAGGTA TGTTTTTAGA ACAAGTAAAT TATCAGTACC ATCTATATTA ACAAAGGAAT 1200

CCATTAAATA ACGTCCCAAT TT 1222



(2) INFORMATION FOR SEQ ID NO: 14: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

TTTGTTATTA TTACnTnAAA TAATTGCATT ACTTTTTACT GATGGTACAA CTTTCCATCC 60
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TTCTTTTGGC ACGACATAAT TGTCTTTATC TTGAACTAAA TATCCGCCAG ATACTGAAAC 180 

AAACTCTTCT TCGTTACTGT CTATAGTCAT ATCAATTTCT AATAATCTTA CATTCTTCTT 240

TTGTTTTAAA ATATCTAATG CTTCATCTGT AAATTTTGGT GCAATAATGA CTTCCAAAAA 300

GATACTATGC AATTGCTCTG CTAACTCAGG TGTTACAGCT CGGTTTAATG CAACAATTCC 360

ACCAAATATT GATTGACTAT CCGCTTCATA CGCATGTTGA AATGCTTGTT CTATCGTGTC 420

ACCGATACGA ACACCACATG GATTCATGTG TTTAACCGCA ACTGTAGCAG GTGTATCAAA 480

CTTTTTAACT AAAGCTAGTG TAGCATCTGC ATCTTTAATA TTGTTATAGC TTAATTGTTT 540

CCCATGTAAT TGTTTAGCGC CTGCAATCGT GTGCTTAGCA TTCGAAGTTC TCACAAAATA 600

CGCTGATTGT TGTGGATTTT CTCCATATCT TAAAGTTTCT TTATCCCCTT TAAAGAAACG 660

TACAATCGCT TCATCATATT CTGCAGTATG CTCAAAAACT TTAATCATTA ATGATTGTCT 720

ATATGACTCA TCTAACGAAT CGTTTCTTAA TCGCGTCAAT ACTTCTTGAT AATCTGCCGG 780

ATGTAGAATT GTTGTTACAT GTTTATAGTT TTTAGCTGCA GCACGTAACA TTGTTGGACC 840

ACCAATATCA ATATTTTCAA TTGCTTCGTC CATCGTCACA TCAGGGTTTG CAACAGTTTG 900

TTGGAATGGA TATAAATTAA CTACTACCAT ATCAATTAAA TCTATATGTT GTTCTGATAA 960

TTCATTTAAA TGCTGCGGTT TATTTCGATC AGCTAAAATG CCACCATGAA CAGCCGGATG 1020

T 1021 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3759 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:

TCATTCACTC CTAAATTGTT ATTACACTAT TACACaTAGC TAATCATCAA TGTGAAATCA 60 

CCTTCAAAGA CACTATCCAA ATCTTCAGAA GTCAAAATAA AGTTTGTACC AGTAGTCAGT 120 

TTGAAAATTT CACCATCGAC AATCATTTGC CCTTCGCCTT CCAACACTGT AACTAAACAG 180

AACTCTCTAG GCTTCATATA ATTTAACGTG CCAGAAATTT CCCATTTAAC CAATGTAAAG 240

AAATCATTCG ATACAATGTG TGTACACTTA TGGTTTTCAA TAATTTCGCT TTCAGGCAAA 300 

ATATTAGGTA ATGGTGCATT GTACTGAATA ACGTCTAAAG CTTTTTCAAT ATTTAACGGT 360 

CTATCATTAT ATTGATTATC TTGACGATTG AAATCATAAA GTCTATATGT AATGTCTGAC 420 
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ATAAAAtAGa ATTCyCCAGG kTTTAC t TTA At a C ATCyAA gTAtCGaCtC tATCGTTCCG 54 0 

TGTTGAACAT GATTCGCAAC TTCTTCTCTA GACTCTGCTA ATGTCCCtAT AACTATTTCT 600 

GCATCTTCTT CTGCATCTAT AATATACCAA CATTCAGATT TGCCATATTG CCCgTTTTCA 660

TGCTCATAAG CATAAGAATT ATCAGGGTGC ACATGAATAG AAAGTGATTC TCTTGCATCC 720

ACTATTTTAG TTAGAAGCGG AAAATCTTTG CTTGGGAAAT CACCAAACAA TTCACGATGT 780

TCTGACCAAA TACGGTCTAA TGTTTGACCT TGATATGGTC CATTAATAAT CTCGCTCGTA 840

CCATTTGGAT GTGCTGACAC ACACCAACAT TCCCCCAGTT GTATCATTGT CTAATTGATA 900 

TCCAAACTCA CTTAGACGTT GACCGCCCCA TAATTTTGTT TTTAAAATTG GTTGTAAAAA 960 

TAATGGCATT GTTGCACCTC CATTGTGATT AAGTAAGCAA TAGAACTCTG ATGTTGTTGT 1020 

TCCATTATAT TTTGATTTTG TTCTCATTTA CATCGTATTA TTAACTTCCA CATTTCAAAT 1080

TAACTATTAG TGATTGTACC ATATTTACTA ACATTGCAGT ACTGCCAATT AAAAGnGCTT 1140

CACTTAAATT TACAGTACTT TAACATTTTC AAAAATTTAT AGCATAGAGA TTATATCTCT 1200

CTTACATTTG TACATATTTC CCTTTAAATT TACTCGCCCA TTATACCAAT TAATAaACAA 1260

CTTTAATAGT TGTGCCATAC ATTGTTCAAA TTCTTTGTAA AACGCATAGA CAATACGTAC 1320

TTATTCATAC TTATAATTCA TCATTTTCAA AAAATAACGA GTTACGAAAA AGTAACCCGC 1380

TTCAAATCAT ATTTACTATC CTTATTAATC CGTTTCATTT TCAAATTGAG TTAAAGCATC 1440

TTTAATGTCC TGATCACCAC TAATAATTTG AAACTCTTGG TGATTAAAAT GATTGGATGT 1500

GACAATTTCT TTTAATACTG TCGCAACATC TTCTCTAGGA ATTTCACCTT TACCATCAAA 1560

ATATTGTGCA GCTTCTATCT TTCCAGATCC TGCTGCATTT GTAAGTGCCC CTGGATGTAA 1620

AATTGTATAA TTCAAACCTG nAACGTCTTA AATAGTCATC AGCGTAATGT TTAGCTATTG 1680

TATATGGCTT TAAATCACCG CTATCATCAA AAGCCTGACG TCTCGAATCA TATGTTGAAA 1740 

CCATGACATA GTGTTTAATA TTGGCCTCTT TACTCGCAAT CATTGATTTA ACAGCACCAT 1800 

CTAAATCGAC AATAATTGTT TTATCTGCAC CCGTGTTCCC TCCAGAACCT ACTGAAAAGA 1860 

TAACTTTATC GAATGGTTTA AACGTCTCAG TTAAAGTCTC TATTGAATCA TTTTCAACAT 1920 

CAACAAGAAT TGCTTTCATA CCTTGTGATT TTAACGCATT AAGTTGATCT GATTGCCTAA 1980

CACCAGCAGT AAATGGTACA TTTTCTTTTG CTAATTGTTG CACTAGTAAC GAACCTACAC 2040

CGCCATTAGC ACCTATAACC AAAATATTCA TTTACAACAC TCTCCTATkT ATTATTCTCT 2100

ATGCCATACC ACTTTATGAG ATATGTAAAA CTTGTTACAA CTATAAAAAT CAATTGACAT 2160 

ACTACTGGGA ACGTATTAAA TTAATATATG AACAAATATT CATATGAAAG GATTGTCATA 2220 

EP 0 786 519 A2

tCaAGGCATT AGcGATTACA ATCGAATACG TATCaTGGAA TTGTTATCaG TCAGCGAAgC 2340

AAGTGTTGGT CACATTtCAC ATCAATTGAA TTTATCTCAA TCAAATGTCT CGCACCAATT 2400

AAAATTACTT AAAAGTGTGC ATCTTGTGAA AGCAAAACGA CAAGGCCAAT CAATGATTTA 2460

TTCATTAGAT GACATCCACG TAGCAACTAT GTTAAAGCAA GCCATACATC ACGCGAATCA 2520

TCCTAAAGAA AGTGGGTTAT AATATGTCTC ATTCACATCA TCATCATGAC CATATGCATA 2580

GTCATGTAAC TACAAATAAT AAGAAAGTAT TGTTTATATC GTTTTTAATA ATCGGTCTAT 2640

ATATGTTTAT CGAAATCATC GGCGGTCTCC TTGCTAACAG CTTGGCATTA CTATCTGACG 2700 

GTATCCATAT GTTTAGCGAC ACATTCTCAT TAGGTGTTGC ACTTGTCGCA TTTATTTATG 2760

CTGAAAAGAA TGCCACAACT ACAAAAACAT TTGGTTATAA ACGTTTCGAA GTACTCGCAG 2820

CGTTATTTAA CGGTGTAACG CTTTTTGTAA TAAGTATTTT GATTGTTTTT GAAGCGATTA 2880

AACGTTTCTT TGTTCCTTCT GAAGTTCAAT CAAAAGAAAT GTTAATCATT AGTATTATCG 2940

GTTTAATTGT CAATATCGTT GTTGCATTCT TTATGTTTAA AGGCGGCGAC ACTTCACACA 3000

ATTTAAATAT GCGTGGTGCT TTTCTACATG TTATCGGAGA CTTATTAGGT TCAGTTGGGG 3060

CCATTACTGC AGCTAkTTTA ATTTGGGCAT TTGGATGGAC AATCGCCGAT CCTATCGCAA 3120

GTATTTTAGT TTCCGTTATT ATTTTAAAAA GTGCTTGGGG TATCACAAAA TCTTCAATTA 3180

ACATTTTAAT GGaAGGCACA CCAAGTGATG TTGATATAGA TGAAGTTATA ACTACTATTA 3240

AAAAGGATTC ACGAATACAA AGTGTGCATG ATTGCCATGT TTGGACAATT TCAAATGATA 3300

TGAATGCATT GAGTTGTCAT GTTGTTGTAG ACCATACATT GACAATGAAA GAATGTGAAT 3360

TATTATTAGA AAaCATTGAG CATGATTTAT TACATTTAAA TATTCACCAT ATGACTATTC 3420

AATTAGAAAC GCCTAATCAC AAACATGATG AATCGATTAT ATGTTCAGGA ACACATAGTC 3480

ATTCACATAA CCATCATGCT CATCATCACG CGCATGTACA TTAATAATTT TAACCTACTG 3540

CCATTGCATC GATTAAACTT TTCAATGGCA GTAGGTTTTT TATGTCTTTA TGGCGACTTG 3600

TTTGGTCTTT GATGATGCAA TGTTTATTAA CAAATTTTCA ACTATTATTT CTTACATTAG 3660

TCATATTTTT GACAATTTAC TATTATAATT CTCTAACTTT AGTCACTTTA ATTAATTTTT 3720

ATTAGATATT AATATGAAAA TAACGTGTTT TTTGTTATT 3759

(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 13086 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear


(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

TAATTATCGC GCATAACAAA ACATTAGCAG GACAATTATA TAGTGAGTTT AAAGAATTTT 60 

TTCCTGAAAA CAGGGTGGAA TACTTTGTAA GTtACTATGA TTATTATCAn CCAGAGGCAT 120 

ACGTACCGTC TACTGACACT TTTATTGAAA nAGATGCCTC AATCAnTGAT GAAATTGATC 180 

AACTACGACA TTCTGCTACA AGTGCATTAT TTGAACGCGA TGATGTAATT ATTATTGCTA 240

GTGTAAGTTG TATATATGGT TTAGGTAATC CTGAAGAATA TAAAGATTTA GTAGTAAGTG 300

TTCGAGTTGG TATGGAAATG GATAGAAGTG AATTACTTAG AAAACTTGTC AGATGTGCAA 360

TATACACGAA ATGACATCgA TTTcCAACGA GGAACGTTTC GAGTGCGTGG TGATGTAGTG 420

GAAATATTCC CAGCCTCTAA AGAAGAACTT TGTATAAGGG TTGAGTTTTT CGGCGATGAG 480 

ATTGACCGTA TCCGAGAAGT TAACTACCTA ACAGGTGAAG TGTTGAAAGA AAGAGAACAT 540 

TTTGCGATAT TCCCAGCTTC TCACTTCGTA ACACGTGAAG AAAAGTTGAA AGTTGCGATT 600 

GAACGTATTG AAAAAGAATT GGAAGAACGA TTGAAAGAAT TACGAGATGA GAATAAATTA 660 

CTAGAAGCGC AAAGGTTAGA ACAGCGTACC AACTATGATT TAGAAATGAT GCGAGAGATG 720 

GGATTCTGTT CAGGAATTGA AAACTATTCC GTACATTTAA CTTTGCGACC ACTGGGTTCG 780

ACACCATATA CTTTATTGGA TTACTTTGGC GATGATTGGT TAGTAATGAT TGATGAATCA 840

CATGTGACAT TACCGCAAGT TCGAGGCATG TATAACGGAG ACAGAGCGCG TAAACAAGTT 900

TTGGTGGATC ATGGGTTTAG ATTACCGAGT GCATTAGATA ACCGTCCACT TAAATTTGAA 960

GAATTTGAAG mAAAGACAAA ACAACTTGTG TATGTATCTG CAACGCCTGG ACCATACGAA 1020 

ATTGAACATA CGGATAAGAT GGTTGAACAA ATTATTCGTC CTACTGGTTT ACTGGATCCT 1080 

AAGATTGAGG TTAGACCTAC TGAAAATCAA ATTGACGATT TATTAAGTGA AATTCAAACA 1140 

AGAGTgAGCG TAATGAACGC GTACTTGTTA CAACGCTCAC TAAAAAGATG AGTGAAGATT 1200

aACCACATAC ATGAAAGAaG CGGGTATTAA aGTtAATTAT CTGCATTCAG AAATCAAGAC 1260

ATTAGAACGA ATTGAAATAA TTAGAGACTT ACGAATGGGT ACATATGATG TTATCGTAGG 1320

TATTAATTTA TTAAGAGAGG GTATTGATAT ACCAGAAGTT TCTCTAGTTG TCATATTAGA 1380

TGCAGATAAA GAAGGGTTTT TACGTTCTAA CCGCTCATTA ATTCAAaCAA TAGGTAGAgC 1440

TGCGCGTAAC GATAAaGGTG AAGTCATTAT GTATGCCGAT AAAATGACTG ATTCGATGAA 1500

GTATGCAATT GATGAGACAC AACGTCGTCG AGAAATACAG ATGAAACATA ATGAAAAACA 1560

TGGTATTACA CCTAAAACAA TTAATAAAAA AATACATGAT TTAATTAGTG CTACTGTTGA 1620

AAATGACGAA AATAATGACA AAGCACAAAC TGTGATACCT AAGAAGATGA CGAAAAAAGA 1680 
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TTTCGAGAAA 


GCTACAGAAT 


TAAGAGATAT 


GTTATTTGAA 


TTAAAAGCAG 


AAGGGTGACA 


1800 




AGTAAATGAA AGAACCATCC ATAGTAGTAA AAGGTGCTCG TGCGCATAAC TTGAAAGATA 1860

TTGATATCGA ACTACCTAAA AaTAAATTAA TTGTTATGAC AGGTTTATCT GGGTCAGGTA 1920

AATCGTCATT AGCATTCGAT ACTATATATG CTGAAGGACA ACGACGTTAT GTTGAATCAT 1980

TAAGTGCCTA TGCGCGTCAA TTTTTAGGCC AAATGGACAA ACCAGATGTT GATACAATTG 2040

AAGGATTATC GCCAGCAATT TCAATAGATC AAAAAACAAC AAGTAAAAAT CCAAGATCAA 2100

CTGTAGCAAC AGTAACAGAA ATATATGATT ATATACGTTT GTTATATGCA CGTGTTGGTA 2160

AACCTTACTG TCCAAATCAC AATATAGAAA TTGAATCGCA AACAGTACAA CAAATGGTTG 2220

ACCGCATTAT GGAATTAGAG GCACGTACAA AGATTCAATT ATTAGCACCT GTCATCGCTC 2280

ATCGTAAAGG TAGTCATGAA AAGCTAATCG AAGATATTGG TAAAAAAGGT TATGTACGTT 2340

TAAGAATCGA TGGCGAAATT GTTGATGTAA ATGATGTACC TACTTTAGAT AAGAACAAGA 2400

ATCATACAAT AGAAGTTGTT GTAGACCGAT TAGTTGTTAA AGATGGAATT GAAACACGAC 2460

TAGCTGACTC TATAGAAACT GCCTTAGAGC TTTCAGAAGG ACAATTAACA GTCGATGTCA 2520

TTGACGGGGA AGACCTTAAG TTTTCAGAAA GCCATGCTTG TCCTATATGT GGATTTTCAA 2580

TCGGAGAGTT AGAACCAAGA ATGTTTAGCT TTAACAGTCC TTTTGGTGCT TGTCCGACAT 2640

GTGATGGCTT AGGCCAAAAG TTAACAGTCG ATGTAGACTT GGTTGTTCCC GACAAAGATA 2700

AGACGCTAAA CGAAGGTGCA ATAGAACCTT GGATACCGAC GAGTTCTGAT TTTTATCCAA 2760

CATTGTTAAA ACGTGTTTGT GAAGTTTATA AAATCAATAT GGATAAACCT TTTAAAAAGT 2820

TAACAGAACG TCAACGTGAT ATTTTATTGT ATGGTTCTGG TGACAAAGAA ATTGAATTTA 2880

CATTTACACA ACGTCAAGGT GGTACTAGAA AACGAACAAT GGTTTTCGAG GGTGTAGTTC 2940

CTAATATAAG TAGACGATTC CATGAATCTC CTTCAGAATA TACACGTGAA ATGATGAGTA 3000

AATATATGAC TGAACTACCT TGCGAAACTT GTCATGGAAA GCGATTGAGT CGTGAAGCkT 3060

TATCTGTTTA TGTAGGTGGT TTAAATATTG GTGAAGTAGT CGAATATTCA ATCAGTCAAG 3120

CGCTGAACTA TTATAAAAAC ATTGATTTGT CAGAACAAGA TCAAGCGATT GCAAATCAAA 3180

TATTGAAAGA AATTATTTCC CGACTCACTT TTTTAAATAA TGTGGGACTT GAATATTTAA 3240

[illegible] AGCTTCAGGT ACACTTTCAG GTGGTGAAGC ACAACGTATT CGATTAGCAA 3300

CGCAAATTGG GTCGCGTTTG ACTGGTGTCT TATATGTATT AGATGAGCCA TCAATTGGAC 3360

TGCATCAAAG AGATAATGAT CGATTAATTA ATACACTTAA AGAAATGAGA GATTTAGGAA 3420

ATACTTTAAT TGTAGTTGAA CACGATGATG ATACAATGCG TGCGGCTGAT TACTTAGTGG 3480




AGGTAATGAA AGATAAAAAA TCATTAACAG GACAATACTT GAGTGGTAAG AAACGTATTG 3600

AAGTACCTGA ATATCGCAGA CCGGCTTCAG ATCGTAAAAT TTCTATACGT GGAGCTAGAA 3660

GCAACAATCT TAAAGGGGTT GATGTGGACA TACCACTATC AATCATGACG GTTGTTACAG 3720

GTGTATCAGG TTCTGGTAAA AGCTCATTAG TAAATGAAGT ATTATACAAA TCATTAGCTC 3780

AAAAAATTAA TAAATCTAAA GTAAAGCCAG GATTGTACGA TAAGATTGAA GGTATTGATC 3840

AACTTGATAA AATTATTGAT ATTGATCAAT CACCAATAGG TAGAACGCCA CGCTCTAATC 3900

CAGCAACATA TACTGGTGTG TTTGATGATA TACGTGATGT GTTTGCGCAA ACAAATGAAG 3960

CTAAAATTCG AGGATATCAA AAAGGGCGTT TTAGTTTTAA TGTAAAAGGT GGACGCTGTG 4020

AAgcTTGTAA AGGTGACGGT ATTATTAAAA TTGAAATGCA TTTTTTACCT GATGTTTATG 4080

TTCCTTGTGA AGTGTGTGAT GGTAAACGAT ATAATCGTGA GACACTAGAG GTTACTTACA 4140

AAGGTAAAAA TATTGCTGAC ATTTTAGAAA TGACTGTTGA AGAAGCAACA CAATTTTTTG 4200

AAAATATTCC TAAGATTAAG CGCAAGTTAC AAACACTAGT TGATGTTGGT CTTGGATACG 4260

TCACATTAGG TCAACAAGCT ACAACGTTAT CAGGTGGTGA GGCTCAACGT GTGAaACTTG 4320

CATCTGAACT TCATAAACGT TCAACTGGTA AATCTATTTA TATCCTAGAT GAACCGACAA 4380

CAGGGTTACA TGTTGACGAT ATTAGTAGAT TATTAAAAGT ATTAAACCGA TTAGTTGAAA 4440

ATGGTGATAC TGTTGTAATT ATTGAACATA ACCTAGATGT TATCAAAACA GCAGACTATA 4500

TTATAGACTT AGGTCCTGAA GGTGGTAGTG GCGGTGGTAC TATTGTTGCG ACTGGCACAC 4560

CCGAAGATAT TGCTCAGACA AAGTCATCAT ATACAGGAAA GTATTTAAAA GAAGTACTTG 4620

AACGAGATAA ACAAAATACT GAAGATAAAT AAGATTAAAA GAAGTGAAGG ATGTTATAAA 4680

TTTATCCTTC GCTTCTTTTT ATTAATTTAG TAATGAATAG TAGAAAGAAA AGATGCGTAA 4740

AAAGAATTAT GTTAAGATAG GGTCAATCTA GAGTAGTTAA ACATAAATCG AACTGGGAGT 4800

GGGACAGAAA TGATAAAGAA TCACTAATGA TTTATTATGT AGTGGTTCTT TGTCATTAGC 4860

CACAGCTATT GTGTACTTAA AAATAGGaat GCaTgAGTGC AACTCATGCA TAAGaAATAC 4920

TAATTTCTAA AGAAAAAGTA TTTCTTTATG TTGGGGCCCC GCCAACTTGC ATTGTTTGTA 4980

GAATTTCTTT TCGAAATTCT TTATGTTGGG GCCCCGCCAA CTTGCATTGT TTGTAGAATT 5040

TCTTTTCGAA ATTCTTTATG TTGGGGCCCC GCCAACTAAT TCCAATATAT CATTGTAGAG 5100
CTTAGGTCAT TGATTTTTGG CTCGGACTTT TATGGCGATA TGAACCATGT AAATTAAGCA 5160
AGCAATAAAT TAATGATTGA TATTGACTTG TAAAATAATA ACAATAATGA ACAATTAATA 5220
TTTATTTTAG CTTTTCAATG TAGATTGGTG TTATATTTTT GATATGATAA GAAGAGATGT 5280

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ACATTAAAGT TAGATTTAAT CGCTGGTGAA GAAGGACTAT CGAAGCCAAT TAAAAATGCT 5400

GATATATCAA GACCGGGCTT AGAGATGGCA GGTTATTTTT CACATTATGC GTCAGATAGA 5460

ATACAACTAT TAGGAACAAC GGAACTATCG TTTTACAATT TATTACCAGA TAAGGATCGC 5520

GCAGGTCGTA TGCGTAAACT ATGCAGACCA GAAACGCCTG CAATTATTGT GACACGTGGA 5580

TTGCAGCCAC CAGAAGAATT AGTTGAAGCT GCAAAAGAAT TAAATACCCC ACTTATAGTT 5640

GCTAAAGATG CGACTACAAG TTTAATGAGT CGCTTAACAA CGTTTTTAGA GCATGCACTT 5700

GCAAAGACGA CATCTTTACA TGGTGTTTTA GTAGATGTTT ACGGTGTTGG TGTACTAATT 5760

ACCGGTGATT CAGGAATAGG TAAAAGTGAG ACTGCGTTGG AATTAGTTAA ACGTGGGCAT 5820

AGATTAGTAG CAGATGATAA TGTAGAAATA CGTCAAATTA ATAAAGATGA ACTAATAGGG 5880

AAACCACCAA AGTTAATAGA ACATCTATTA GAAATACGTG GACTAGGTAT TATCAATGTT 5940

ATGACTTTAT TTGGCGCGGG TTCAATATTA ACTGAAAAAC GAATTAGATT AAATATTAAT 6000

TTGGAAAACT GGAACAAGCA AAAGTTATAT GACCGCGTAG GTCTTAATGA AGAGACGCTA 6060

AGTATTTTAG ATACTGAAAT CACTAAAAAA ACAATACCTG TAAGACCTGG TAGAAATGTT 6120

GCGGTAATTA TTGAGGTCGC TGCAATGAAC TATCGATTAA ATATCATGGG CATTAACACG 6180

GCCGAAGAAT TTAGTGAAAG ATTAAATGAA GAAATTATCA AGAACAGTCA TAAGAGTGAG 6240

GAGTAGGTTG AATGGGTATT GTATTTAACT ATATAGATCC TGTGGCATTT AACTTAGGAC 6300

CACTGAGTGT ACGATGGTAT GGAATTATCA TTGCTGTCGG AATATTACTT GGTTACTTTG 6360

TTgCACAACG TGCACTAGTT AAAGCAGGAT TACATAAAGA TACTTTAGTA GATATTATTT 6420

TTTATAGTGC ACTATTTGGA TTTATCGCGG CACGAATCTA TTTTGTGATT TTCCAATGGC 6480

CATATTACGC GGAAAATCCA AGTGAAATTA TTAAAATATG GCATGGTGGA ATAGCAATAC 6540

ATGGTGGTTT AATAGGTGGC TTTATTGCTG GTGTTATTGT ATGTAAAGTG AAAAATTTAA 6600

ACCCATTTCA AATTGGTGAT ATCGTTGCGC CAAGTATAAT TTTAGCGCAA GGAATTGGAC 6660

GCTGGGGTAA CTTTATGAAT CACGAGGCAC ATGGTGGATC GGTGTCACGC GCTTTTITAG 6720

AACAATTACA TTTGCCTAAT TTTATAATAG AAAATATGTA TATTAACGGC CAATATTATC 6780

ATCCAACATT CTTATATGAA TCCATTTGGG ATGTCGCTGG ATTTATTATC TTAGTTAATA 6840

TTCGTAAACA TCTTriTATA TTTAACTTGG TATTCAATTG 6900

GTCGATTCTT TATAGAAGGA TTACGTACAG ATAGCTTAAT GCTCACAAGT AATATTAGAG 6960

TTGCACAATT AGTATCAATT CTTTTAATTT TAATAAGTAT AAGTTTAATT GTATATAGAA 7020

GGATTAAGTA TAATCCACCG TTGTATAGCA AAGTTGGGGC GCTTCCATGG CCAACAAAAA 7080

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TTATGGCGTG TATACCGTCT TGTTAAATTT TCGAAAGTTT TTAAGAATGT AATTATCATT 7200

GAATTTTCGA AATTTATTCC AAGTATGGTA CTGAAAAGAC ATATATATAA ACAACTTTTA 7260

AATATTAATA TCGGTAATCA ATCGTCGATA GCTTATAAAG TAATGTTAGA TATTTTTTAC 7320

CCAGAACTGA TTACGATTGG TAGTAACAGT GTTATTGGTT ACAATGTAAC AATTTTGACG 7380

CATGAAGCAT TAGTTGATGA ATTTCGTTAT GGACCAGTGA CGATAGGATC TAACACTTTG 7440

ATTGGTGCAA ATGCTACCAT TTTACCCGGT ATAACGATTG GTGACAATGT AAAAGTTGCA 7500

GCTGGTACGG TTGTTTCAAA AGATATACCG GATAATGGAT TTGCATATGG CAACCCTATG 7560

TATATAAAAA TGATTAGGAG GTGACAATTT TATGGCGCAA AAGAATAATA ATGTAATTCC 7620

AATGACTTTT GATGATGCAT TTTATCGTAA AATGGCTAAA CAGAAGTTTA AACAAAGAGA 7680

ATATAAACGA GGTGCTGAAT ACTTTGAAAA AGTGTTAGAA TTGTCACCTG ATGATCTGGA 7740

AATTCAAATT GATTATGCAC AATGTCTAGT GCAACTTGGT ATTGCTAAAA AAGCAGAACA 7800

TTTATTTTAT GACAATATTA TTTATAATAG GCATCTAGAA GATAGCTTTT ATGAATTGAG 7860

TCAGCTCAAC ATTGAAGTTA ACGAACCAAA CAAGGCATTC TTGTTTGGTA TTAATTATGT 7920

TATTGTTAGC GACGACCAAG ATTATAGAGA TGAATTAGAT CAAATGTTTG ATGTGAAATA 7980

TCAAAGTGAA GAACAAATTG AACTTGAAGC TCAATTGTTT GTAGTTCAAA TACTATTCCA 8040

ATATCTTTTT TCTCAAGGTC GATTAAAAGA TGCAAAGAAT TATGTCTTAC ATCAACCACA 8100

AGAAGTTCAA GATCATCGTG TAGTACGTAA TTTATTGGCA ATGTGTTATT TATATCTCGG 8160

TGAATATGAT ACgGCTAAAG CATTGTACGA aGCACtATTA CAAGAGGATA GTACaGATAT 8220

ATATGCATTA TGCCATTATA CTTTGCTACT TTATAACACT AAGGAAAATG AACAATATCA 8280

AAAATATTTA AAAATATTAA ACAAAGTTGT ACCTATGAAT GACGATGAAA GTTTTAAATT 8340

AGGTATTGTA TTAAGTTATT TAAAGCAGTA TCGTGCATCA CAACAATTGT TGTACCCTTT 8400

ATATAAAAAA GGGAAATTTT TATCAATTCA AATGTACAAT GCTTTAGCAT ATAATTATTA 8460

TTATTTAGGT GAAGAAGACG AAAGTCATTA CTACTGGGAT AAATTGAAGC AAATTTCTAA 8520

AGTGGAAATT GGACATGCGC CTTGGGTAAT TGAAAATAGC AAAGAAGTTT TTGACCAACA 8580

TATTTTGCCA TTACTTCAAA GTGATGACAG TCATTATCGT TTATATGGTA TTTTTTTATT 8640

GGATCAATTA AATGGTAAAG AAATTGTGAT GACGGAAAGT ATTTGGCAGG TTTTGGAAAA 8700

TCTAAATAAT TATGAGAAAT TGTATTTAAC GTATTTAGTT CAAGGTTTAA CGCTCAATAA 8760

ATTAGACTTC ATTCATCGCG GCTTATTAAC GCTTTAGCAT AATGAATTAT TTGTAAGTGA 8820

AAATGATGTA ATGGTTGCAT GGATTAATCA AGGTGAACTC ATAATTGCTG AAAAAGTAGA 8880

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

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



TCGAAACGTT ACAAAGAAGC AAATTACAAC ATGGTTAGGC ATAACACAAT ATAAACTGAA 9000

CAAAATGATT GAATTTCTCT TGAGCATATA GATTTATGAA AAGTTAGATT TATTATATAA 9060

TGCGCATAAT GATTAATAAT GAGGAGGCGT TAATAAAATG ACTGAAATAG ATTTTGATAT 9120

AGCAATTATC GGTGCAGGTC CAGCTGGTAT GACTGCTGCA GTATACGCAT CACGTGCTAA 9180

TTTAAAAACA GTTATGATTG AAAGAGGTAT TCCAGGCGGT CAAATGGCTA ATACAGAAGA 9240

AGTAGAGAAC TTCCCTGGTT TCGAAATGAT TACAGGTCCA GATTTATCTA CAAAAATGTT 9300

TGAACACGCT AAAAAGTTTG GTGCAGTTTA TCAATATGGA GATATTAAAT CTGTAGAAGA 9360

TAAAGGCGAA TATAAAGTGA TTAACTTTGG TAATAAAGAA TTAACAGCGA AAGCGGTTAT 9420

TATTGCTACA GGTGCAGAAT ACAAGAAAAT TGGTGTTCCG GGTGAACAAG AACTTGGTGG 9480

ACGCGGTGTA AGTTATTGTG CAGTATGTGA TGGTGCATTC TTTAAAAATA AACGCCTATT 9540

CGTTATCGGT GGTGGTGATT CAGCAGTAGA AGAGGGAACA TTCTTAACTA AATTTGCTGA 9600

CAAAGTAACA ATCGTTCACC GTCGTGATGA GTTACGTGCA CAGCGTATTT TACAAGATAG 9660

AGCATTCAAA AATGATAAAA TCGACTTTAT TTGGAGTCAT ACTTTGAAAT CAATTAATGA 9720

AAAAGACGGC AAAGTGGGTT CTGTGACATT AACGTCTACA AAAGATGGTT CAGAAGAAAC 9780

ACACGAGGCT GATGGTGTAT TCATCTATAT TGGTATGAAA CCATTAACAG CGCCATTTAA 9840

AGACTTAGGT ATTACAAATG ATGTTGGTTA TATTGTAACA AAAGATGATA TGACAACATC 9900

AGTACCAGGT ATTTTTGCAG CAGGAGATGT TCGCGACAAA GGTTTACGCC AAATTGTCAC 9960

TGCTACTGGC GATGGTAGTA TTGCAGCGCA AAGTGCAGCG GAATATATTG AACATTTAAA 10020

CGATCAAGCT TAATTCGAAG TCGAATTAAG ATGTTGAGCT GTAAATTATT TGGATATTTA 10080

TTTTAATAGT GTCATCACAG CGTTAAAATA ATGTCTTACT TTTAAATTAA AGCAAATTAT 10140

ATAGSAAACT AGAACTTAGT ACGTATCATT TGTGCGTTTC AATGAGTTCT AGTTTTTTTA 10200

TATGTTATAT TAAACTTATA ACTTTATGGG AGTGGGACAG AAA7GATAAA GAGCCACTAA 10260

TGATTTATTA TGTAGTGGTT CTTAAACATT AGCCACAGCT AATGTGTACT TAAAAATAGG 10320

AATACATGAG TAAAACTCAT GCATAAGAAA TACTAATTTC TATAGAAAAA GTATTACTTT 10380

ATCGTTGTCC CACCCCAACT TGCACATTAT TGTAAGCTGA CTTTCCGCCA GCTTCTGTGT 10440

TGGGGCCCCG CCAACTTGCA CATTATTGTA AGCTGACTTT TCGTCAgCTT CTGTGTTGGG 10500

GCCCCGCCAA CTTGCACATT ATTGTAAGCT GACTTTTCGT CAGCTTCTGT GTTGGGGCCC 10560

CGCCAACTTG CATTGTCTGT AGAAATTGGG AATCCAATTT CTCTATGTTG GGGCCCACAC 10620

CCCAACTCGC ATTGCCTGTA GAATTTCTTT TCGAAATTCT CTGTGTTGGG GCCCACACCC 10680

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ACTCGCATTG CCTGTAGAAT TTCTTTTCGA AATTCTCTGT GTTGGGGCCC CTGACTAGAG 10800

TTGAAAAAAG CTTGTTGCAA GCGCATTTTC ATTCAGTCAA CTACTAGCAA TATAATATTA 10860

TAGACCCTAG GACATTGATT TATGTCCCAA GCTCCTTTTA AATGATGTAT ATTTTTAGAA 10920

ATTTAATCTA GACATAGTTG GAAATAAATA TAAAACATCG TTGCTTAATT TTGTCATAGA 10980

ACATTTAAAT TAACATCATG AAATTCGTTT TGGCGGTGAA AAAATAATGG ATAATAATGA 11040

AAAAGAAAAA AGTAAAAGTG AACTATTAGT TGTAACAGGT TTATCTGGCG CAGGTAAATC 11100

TTTGGTTATT CAATGTTTAG AAGACATGGG ATATTTTTGT GTAGATAATC TACCACCAGT 11160

GTTATTGCCT AAATTTGTAG AGTTGATGGA ACAAGGAAAT CCATCCTTAA GAAAAGTGGC 11220

AATTGCAATT GATTTAAGAG GTAAGGAACT ATTTAATTCA TTAGTTGCAG TAGTGGATAA 11280

AGTCAAAAGT GAAAGTGACG TCATCATTGA TGTTATGTTT TTAGAAGCAA GTACTGAAAA 11340

ATTAATTTCA AGATATAAGG AAACGCGTCG TGCACATCCT TTGATGGAAC AAGGTAAAAG 11400

ATCGTTAATC AATGCAATTA ATGATGAGCG AGAGCATTTG TCTCAAATTA GAAGTATAGC 11460

TAATTTTGTT ATAGATACTA CAAAGTTATC ACCTAAAGAA TTAAAAGAAC GCATTCGTCG 11520

ATACTATGAA GATGAAGAGT TTGAAACTTT TACAATTAAT GTCACAAGTT TCGGTTTTAA 11580

ACATGGGATT CAGATGGATG CAGATTTAGT ATTTGATGTA CGATTTTTAC CAAATCCATA 11640

TTATGTAGTA GATTTAAGAC CTTTAACAGG ATTAGATAAA GACGTTTATA ATTATGTTAT 11700

GAAATGGAAA GAGACGGAGA TTTTCTTTGA AAAATTAACT GATTTGTTAG ATTTTATGAT 11760

ACCCGGGTAT AAAAAAGAAG GGAAATCTCA ATTAGTAATT GCCATCGGTT GTACGGGTGG 11820

ACAACATCGA TCTGTAGCAT TAGCAGAACG ACTAGGTAAT TATCTAAATG AAGTATTTGA 11880

ATATAATGTT TATGTGCATC ATAGGGACGC ACATATTGAA AGTGGCGAGA AAAAATGAGA 11940

CAAATAAAAG TTGTACTTAT CGGTGGTGGC ACTGGCTTAT CAGTTATGGC TAGGGGATTA 12000

AGAGAATTCC CAATTGATAT TACGGCGATT GTAACAGTTG CTGATAATGG TGGGAGTACA 12060

GGGAAAATCa GAGATGAAAT GGATATACCA GCACCAGGAG ACATCAGAAA TGTGATTGCA 12120

GCTTTAAGTG ATTCTGAGTC AGTTTTAAGC CAACTTTTTC AGTATCGCTT TGAAGAAAAT 12180

CAAATTAGCG GTCACTCATT AGGTAATTTA TTAATCGCAG GTATGACTAA TATTACGAAT 12240

GATTTCGGAC ATGCCATTAA AGCATTAAGT AAAATTTTAA ATATTAAAGG TAGAGTCATT 12300

CCATCTACAA ATACAAGTGT GGAATTAAAT GCTGTTATGG AAGATGGAGA AATTGTTTTT 12360

GGAGAAACAA ATATTCCTAA AAAACATAAA AAAATTGATC GTGTGTTTTT AGAACCTAAC 12420

GATGTGCAAC CAATGGAAGA AGCAATCGAT GCTTTAAGGG AAGCAGATTT AATCGTTCTT 12480

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GCGTTAATTC ATTCTGATGC GCCTAAGCTA TATGTTTCTA ATGTGATGAC GCAACCTGGG 12600

GAAACAGATG GTTATAGCGT GAAAGATyAT ATCGATGCGA TTCATAGACA AGCTGGACAA 12660

CCGTTTATTG ATTATGTCAT TTGTAGTACA CAAACTTTCA ATGCTCAAGT TTTGAAAAAA 12720

TATGAAGAAA AACATTCTAA ACCAGTTGAA GTTAATAAGG CTGAACTTGA AAAAGAAAGC 12780 

ATAAATGTAA AAACATCTTC AAATTTAGTT GAAATTTCTG AAAATCATTT AGTAAGACAT 12840 

AATACTAAAG TGTTATCGAC AATGATTTAT GACATAGCTT TAGAATTAAT TAGTACTATT 12900 

CCTTTCGTAC CAAGTGATAA ACGTnAATAA TATAGAACGT AATCATATTA TGATATGATA 12960 

ATAGAGCTGT GAAAAAAATG AAnATAGACA GTGGTTCTAA GGTGAATCAT GTTTTAAATA 13020 

AGAAAGGAAT GACTGTACGA TGAGCTTTGC ATCAGAAATG AAAAATGAAT TAACTAGAAT 13080 
AGACGT 

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1350 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 

CATTAGTCAT GAAAATAGCC GACAACTTCA TCTGTGAAAT CACCGGCCTT TTATTTTAGC 60 

TAACTTTATT TCTGATTTTA CGATTTTAAT TGATCATACA GAGAAAGTGA TCTTTTTACA 120 

ATTTCTAAAA ACTCATGATC TATATTGGAC ATTTGATGAA AATAAGACAA AATGTTTTCT 180 

GTTAGCTTCT CTTGTTTTGG GAATGAATCA TCTTCTTTAA TCCAAATCGC TAATTCGCCT 240 

AATGGTGTTT TATCATCTTT AAATGTTTGT ATATATTCGT AAAAGCTCAT AGTATTCCTT 300 

CTCTCAATTT ACTTATATAA ATCCTACCAC GAAAGCTTTC AAGAAAACAC AATTAAATGT 360 

CTATTTAGTG AACTTTTTAA GGTTGTGCAC TCTTTTAATG TCTGCCAATT AGGTCAATTA 420 

ATCATCACAA TGTACAATTA ACTCTATTTT CAGTTCATAT ACTCACACAC CGTTTTTGAA 480 

CAACACATTA ACTTCTCATT TAGATAAAAC GCAAAAAAGC CTGGCACCAA TACAATAGAT 540

GCCAGACTAA GAGTCTACTA TATAAATTTA TTTAGCGTAT GGTTTTACTT CGATTGCACC 600

TTCATTTTCA TCATGAACAC CATGCTTATA ATAATCAATA TATTGTGGCT CTAAAGGCTT 660

TCTGCCACGT ATAATGTCTG CTGCTTTTTC AGCTAACATT AAAACAGGTG CGTGTATATT 720

GCCATTTGTC GTACGTGGCA TAGCTGATGC ATCAACTACA CGTAAATTTT CCATACCGTG 780

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ACTACAAGAT GGGTGTAATG CTGTTTCACC ATCTCTACGA ACCCAATCAA GAATTTCTTC 900

GTCTGTTTGC ACTTCTGGTC CTGGTGAAAT TTCTCCACCA TTGAATGGAT CCATTGCTTT 960

TTGAGATAAG ATATTTCTTG CTACACGAAT TGCTTCTACC CATTCTTTTT TATCTTCTTC 1020

TGTTGATAAA TAATTAAAGC GGATACTTGG TTTTTCGAAT GGATCTTTAG ATTTGATTTT 1080

CAAGCTACCA CGAGAGTTTG AATACATTGG TCCTACGTGA ACTTGATAAC CATGTGCGAC 1140

CGCTGCCTTT TGACCATCAT ATCTTACAGC TATTGGTAAG AAATGGAACA TTAAGTTAGG 1200

ATAAtL CAACT TCGTTATTTG AACGTACAAA TCCGCCACCT TCAAAATGGT TAGATGCTGC 1260

TGCACCTGTA CGTGTGAAAA TCCATTGTAA ACCAATAAAT GGcATGCGCT TGAtATCTAA 1320

GCTTGGCtGt AATGATACAG GTTCCTTACA 1350




(2) INFORMATION FOR SEQ ID NO: 18: 








(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1376 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 
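
The sequence rows in this listing are printed as blocks of ten bases, six blocks to a full row, with the cumulative base count at the right margin, and the final count should equal the declared LENGTH. A minimal sketch of how a transcription of such rows might be checked is given below; the function names `parse_row` and `check_rows` are hypothetical and are not part of the specification, and the check assumes the rows are supplied from the first row of a sequence with no rows missing.

```python
import re

# IUPAC nucleotide codes (upper or lower case) that may appear in a listing row.
_BASES = re.compile(r"^[ACGTUNRYKMSWBDHVacgtunrykmswbdhv]+$")


def parse_row(line):
    """Split one listing row into (sequence, running_count).

    Assumes the row is whitespace-separated blocks of bases followed by
    the cumulative base count (illustrative helper, not from the patent).
    """
    *blocks, count = line.split()
    seq = "".join(blocks)
    if not _BASES.match(seq):
        raise ValueError(f"unexpected character in row: {line!r}")
    return seq, int(count)


def check_rows(rows, declared_length):
    """Return True if each running count matches the bases read so far
    and the final total equals the declared LENGTH."""
    total = 0
    for line in rows:
        seq, count = parse_row(line)
        total += len(seq)
        if total != count:
            return False
    return total == declared_length
```

For example, the first row of SEQ ID NO: 18, `"TAATGCTATT GGCAACACCA TATATGAAAn CTCCAAACGA TCCTAAACCG ACTATAGATT 60"`, parses to sixty bases with a running count of 60; a row whose count disagrees with the bases read so far points to a transcription error in that row or an omitted row before it.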






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18: 








TAATGCTATT GGCAACACCA TATATGAAAn CTCCAAACGA TCCTAAACCG ACTATAGATT 60

CACCAAATTT nACAATCCAT GAATAAAGTA GTGGCCATAA GAATAACAAT ATGACAACTA 120

AAAATGTACA GTAAAATGCA GTCATAATTG GAACTAGACG TTTACCACTA AAAAATGATA 180

ATGCTAATGG TAATTCTGTT TCACTAAACT TATTGTATGC ATAAGCTGCT ATTAAACCTA 240

TTACAATACC AACAAAGACA TTGCCATTAT TCATCTTTTC AAAAGCTGAA TTTATTTCCG 300

ArGCTTTCAT TCCTAATAAA GGCGCTAATT TCATTGGTGA TAATACAACT GTAACTAAAA 360

AATATCCTAA CGTrGCTGCA rGCGsGACTG CACCATCATT TTTCTTTGCC ATTCCTATAG 420

CTACACCAAT TGCAAATAAA ATACCTAATT GCTCTAAAAT CGTAGTACCT ACCGTAGTAA 480

AGAACATTGC GATTTTCGGC GTCGCATGAA GTGCATTTAA CGTATTACCA ATTCCGGCAA 540

TAATTGCTGC AGCCGGTAAA ATGGCAACTG GTAACATTAA CGAACGCCCT AAATTTTGGA 600

AAAATTTATA CATTGAATGT CATCCTTCTT AAAATAATGT AGAAATATAA AGATTACTAA 660

TGTAACTAGA ATAACTACTT CGATACTCCG TTATAGTCAC CTAGGCTTAC TAACCAGCTA 720

TATTTCTACC TCAAGTTATT TTATAAACTT TTTACAATTT CATGCAATTC TTGTTGTAAC 780

TTTGCTGTTC GTGTTTCAAT CTCTTTTGTA ATATAATCGA TACGCTCGTT TCGTTTTAAA 840

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AAAGACCGTG AATCTTAGTA GGACCAACAT AAGCAACAGG TAATATTGGT GACTTACTTA 960

ACATTGCAAT TGTTGAAGCA CCaCGTTTCA AAGGTGCACC TTCTTGCGAT GTGCGAGAAC 1020

CTGTTGGGAA GATACCAACT GTCTTATTAT CTTTCAACAA ATTGATTGGG CGTTTTAAAG 1080

TACTAGGTCC TGGATTTTCA CGATCTACAG GAAATGCATT TAAAGACGTT AAAAATTTAC 1140

CAATCCATTT ATTTTTGAAT AATTCTTTTT TAGCCATATA ATGAATTTGA TTAGGATATA 1200

ATGCCATACC TAGCATAATG ACTTCGTTAT AACTTTCATG CGTACAAGTT ACGACATATT 1260

TACTATCCTT AGGAATATTA TCTTTACCGA TTACGTATAA TGATTTTGAC ATTTTAACTA 1320

AAATGAAATT CAAAATCTTA CTAATCACTG AATACATTGT GCCACCTACT TAACTT 1376

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7363 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

TTGTCATACC AATATTTTGT AAAATATGGA ACACAAGTAA AGTGACGAAA CCAACGATAA 60

AGATTTTGTT AAATTGATCT TCAATTTTCG CAGCTAATCT TATTAGATGG AAGATTAAAA 120

ATAAAAATAT TAAGATCAAT ATGACAGAAC CGATAAAGCC AAGTTCCTCT CCAATCACTG 180

AAAAGATAAA GTCAGTATGA TTTTCAGGTA TATAAACTTC ACCGTGATTG TATCCTTTAC 240

CTAGTAACTG TCCAGAACCG ATAGCTTTAA GTGATTCAGT TAAATGaTAG CCATCACCAC 300

TACTATATGT ATAGGGGTCA AGCCATGAAT TGATTCGTCC CATTTGATAC AGTTGGaCAC 360

CTAATAAATT TTCAATTAAT GCGGGTGCAT ATAGaATACC TAAAATGACT GTCATTGCAC 420

CAACaATACC TGTAATAAAG ATAGGTGCTA AGATACGCCA TGTTATACCA CTTACTAACA 480

TCACACCTGC AATAATAGCA GCTAATACTA ATGTAGTTCC TAGGTCATTT TGCAGTAATA 540

TTAAAATACT TGGTACTAAC GAGACACCAA TAATTTTGAA AAATAATAAC AAATCACTTT 600

GGAATGATTT ATTGAATGTG AATTGATTAT GTCTAGAAAC GACACGCGCT AATGCTAAAA 660

TTAAAATAAT TTTCATGAAT TCAGATGGCT GAATACTGAT AGGGCCAAAC GTGTACCAAC 720

TTTTGGCACC ATTGATAATA GGTGTAATAG GTGACTCAGG AATAACGAGC AAGCCTATTA 780

ATAATAGACA GATTAAGAAA TACAATAAAT ATGTATAATG TTTAATCTTT TTAGGTGAAA 840

TAAACATGAT GATACCTGCA AAAATTGCAC CTAAAATGTA ATAAAAAATT TGTCTGATAC 900

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TTGCTAAAAC AGCTATAGTG GCTACTAATA CCCAGTCTAC TTTGCGAAnC aATGCTTATC 1020

CGGCTGTTGA CGAGATGAAT AATTCATTGC AAACTCCTTT TATACTCACT AATGTTTATA 1080

TCAATTTTAC ATGACTTTTT AAAAATTAGC TAGAATATCA CAGTGATATC AGCTATAGAT 1140

TTCAATTTGA ATTAGGAATA AAATAGAAGG GAATATTGTT CTGATTATAA ATGAATCAAC 1200

ATAGATACAG ACACATAAGT CCTCGTTTTT AAAATGCAAA ATAGCATTAA AATGTGATAC 1260

TATTAAGATT CAAAGATGCG AATAAATCAA TTAACAATAG GACyAAATCA ATATTAATTT 1320

ATATTAAGGT AGCAAACCCT GATATATCAT TGGAGGAAAA CGAAATGACA AAAGAAAATA 1380

TTTGTATCGT TTTTGGAGGG AAAAGTGCAG AACACGAAGT ATCGATTCTG ACAGCACAAA 1440

ATGTATTAAA TGCAATAGAT AAAGACAAAT ATCATGTTGA TATCATTTAT ATTACCAATG 1500

ATGGTGATTG GAGAAAGCAA AATAATATTA CAGCTGAAAT TAAATCTACT GATGAGCTTC 1560

ATTTAGAAAA TGGAGAGGCG CTTGAGATTT CACAGCTATT GAAAGAAAGT AGTTCAGGAC 1620

AACCATACGA TGCAGTATTC CCATTATTAC ATGGTCCTAA TGGTGAAGAT GGCACGATTC 1680

AAGGGCTTTT TGAAGTTTTG GATGTACCAT ATGTAGGAAA TGGTGTATTG TCAGCTGCAA 1740

GTTCTATGGA CAAACTTGTA ATGAAACAAT TATTTGAACA TCGAGGGTTA CCACAGTTAC 1800

CTTATATTAG TTTCTTACGT TCTGAATATG AAAAATATGA ACATAACATT TTAAAATTAG 1860

TAAATGATAA ATTAAATTAC CCAGTCTTTG TTAAACCTGC TAACTTAGGG TCAAGTGTAG 1920

GTATCAGTAA ATGTAATAAT GAAGCGGAAC TTAAAGAAGG TATTAAAGAA GCATTCCAAT 1980

TTGACCGTAA GCTTGTTATA GAAGAAGGCG TTAACGCACG TGAAATTGAA GTAGCAGTTT 2040

TAGGAAATGA CTATCCTGAA GCGACATGGC CAGGTGAAGT CGTAAAAGAT GTCGCGTTTT 2100

ACGATTACAA ATCAAAATAT AAAGATGGTA AGGTTCAATT ACAAATTCCA GCTGACTTAG 2160

ACGAAGATGT TCAATTAACG CTTAGAAATA TGGCATTAGA GGCATTCAAA GCGACAGATT 2220

GTTCTGGTTT AGTCCGTGCT GATTTCTTTG TAACAGAAGA CAACCAAATA TATATTAATG 2280

AAACAAATGC AATGCCTGGA TTTACGGCTT TCAGTATGTA TCCAAAGTTA TGGGAAAATA 2340

TGGGCTTATC TTATCCAGAA TTGATTACAA AACTTATCGA GCTTGCTAAA GAACGTCACC 2400

AGGATAAACA GAAAAATAAA TACAAAATTG ACTAACTGAG GTTGTTATTA TGATTAATGT 2460

TACATTAAAG CAAATTCAAT CATGGATTCC TTGTGAAATT GAAGATCAAT TTTTAAATCA 2520

AGAGATAAAT GGAGTCACAA TTGATTCACG AGCAATTTCT AAAAATATGT TATTTATACC 2580

ATTTAAAGGT GAAAATGTTG ACGGTCATCG CTTTGTCTCT AAAGCATTAC AAGATGGTGC 2640

TGGGGCTGCT TTTTATCAAA GAGGGACACC TATAGATGAA AATGTAAGCG GGCCTATTAT 2700
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AAACCCTAAA GTAATTGCCG TCACAGGGTC TAATGGTAAA ACAACGACTA AAGATATGAT 2820 

TGAAAGTGTA TTGCATACCG AATTTAAAGT TAAGAAAACG CAAGGTAATT ACAATAATGA 2880

AATTGGTTTA CCTTTAACTA TTTTGGAATT AGATAATGAT ACTGAAATAT CAATATTGGA 2940

GATGGGGATG TCAGGTTTCC ATGAAATTGA ATTTCTGTCA AACCTCGCTC AACCAGATAT 3000

TGCAGTTATA ACTAATATTG GTGAGTCACA TATGCAAGAT TTAGGTTCGC GCGAGGGGAT 3060

TGCTAAAGCT AAATCTGAAA TTACAATAGG TCTAAAAGAT AATGGTACGT TTATATATGA 3120

TGGCGATGAA CCATTATTGA AACCACATGT TAAAGAAGTT GAAAATGCAA AATGTATTAG 3180

TATTGGTGTT GCTACTGATA ATGCATTAGT TTGTTCTGTT GATGATAGAG ATACTACAGG 3240

TATTTCATTT ACGATTAATA ATAAAGAACA TTACGATCTG CCAATATTAG GAAAGCATAA 3300

TATGAAAAAT GCGACGATTG CCATTGCGGT TGGTCATGAA TTAGGTTTGA CATATAACAC 3360

AATCTATCAA AATTTAAAAA ATGTCAGCTT AACTGGTATG CGTATGGAAC AACATACATT 3420

AGAAAATGAT ATTACTGTGA TAAATGATGC CTATAATGCA AGTCCTACAA GTATGAGAGC 3480

AGCTATTGAT ACACTGAGTA CTTTGACAGG GCGTCGCATT CTAATTTTAG GAGATGTTTT 3540

AGAATTAGGT GAAAATAGCA AAGAAATGCA TATCGGTGTA GGTAATTATT TAGAAGAAAA 3600

GCATATAGAT GTGTTGTATA CGTTTGGTAA TGAAGCGAAG TATATTTATG ATTCGGGCCA 3660

GCAACATGTC GAAAAAGCAC AACACTTCAA TTCTAAAGAC GATATGATAG AAGTTTTAAT 3720

AAACGATTTA AAAGCGCATG ACCGTGTATT AGTTAAAGGA TCACGTGGTA TGAAATTAGA 3780

AGAAGTGGTA AATGCTTTAA TTTCATAGAG ATTAGTCGAG GGACCTTTTA CTTATAAAAA 3840

TGATTTGAAT TAATACTAAA AGATTACAAA GAAGAGGTGG TTTTGTGTGT AAATACAAAA 3900

TTGCCTTTTT CTTTTTATGT TAAATCTATA AATTTGAAAC TAAATCAAGG TTAATTCTAT 3960

GTACACACTT TATATAGGAA GTAGTTTGAA TGTTTATATA ATGTTTTACA AAAAGATGTA 4020

GTATTATAAT GTCTAATTTC ACATGTGTTT CAGTAAAATT TGTTGTGGAA TGTTAACGAT 4080

ATACGTATTT TATAAAAaAT TTTTTATAAT GATTATTCGA ATGATGCGTA ACGCTTACAT 4140

CTTATCTAAT GCTAGCTTTT TGACAAAAAT ATGACAATCA ATTAATGTGA TTCTAATAAA 4200

TATTCGCAAA TTGCTTTATT GCGATTAAAT TTTTTTGGTG GTACTATATA GAAGTTGATG 4260

AAATATTAAT GAACTTATAT GCAAAAGTAT ATTGAGAAAT AAACAGGTAA AAAGGAGAAT 4320

TATTTTGCAA AATTTTAAAG AACTAGGGAT TTCGGATAAT ACGGTTCAGT CACTTGAATC 4380

AATGGGATTT AAAGAGCCGA CACCTATCCA AAAAGACAGT ATCCCTTATG CGTTACAAGG 4440

AATTGATATC CTTGGGCAAG CTCAAACCGG TACAGGTAAA ACAGGAGCAT TCGGTATTCC 4500
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AGAATTGGCA ATGCAGGTAG CTGAACAATT AAGAGAATTT AGCCGTGGAC AAGGTGTCCA 4620 

AGTTGTTACT GTATTCGGTG GTATGCCTAT CGAACGCCAA ATTAAAGCCT TGAAAAAAGG 4680

CCCACAAATC GTAGTCGGAA CACCTGGGCG TGTTATCGAC CATTTAAATC GTCGCACATT 4740

AAAAACGGAC GGAATTCATA CTTTGATTTT AGATGAAGCT GATGAAATGA TGAATATGGG 4800

ATTCATCGAT GATATGAGAT TTATTATGGA TAAAATTCCA GCAGTACAAC GTCAAACAAT 4860

GTTGTTCTCA GCTACAATGC CTAAAGCAAT CCAAGCTTTA GTACAACAAT TTATGAAATC 4920

ACCAAAAATC ATTAAGACAA TGAATAATGA AATGTCTGAT CCACAAATCG AAGAATTCTA 4980

TACAATTGTT AAAGAATTAG AGAAATTTGA TACATTTACA AATTTCCTAG ATGTTCATCA 5040 

ACCTGAATTA GCAATCGTAT TCGGACGTAC AAAACGTCGT GTTGATGAAT TAACAAGTGC 5100 

TTTGATTTCT AAAGGATATA AAGCTGAAGG TTTACATGGT GATATTACAC AAGCGAAACg 5160 

TTtAGAAGTA TTanAGAAAT TTAAAAATGA CCAAATTAAT ATTTTAGTCG CTACTGATGT 5220 

AGCAGCaAGA GGACTAGATA TTTCTGGTGT GAGTCATGTT TATAACTTTG ATATACCTCA 5280 

AGATACTGAA AGCTATACAC ACCGTATTGG TCGTACGGGT CGTGCTGGTA AAGAAGGTAT 5340

CGCTGTAACG TTTGTTAATC CAATCGAAAT GGATTATATC AGACAAATTG AAGATGCAAA 5400

CGGTAGAAAA ATGAGTGCAy TcGTCCACCA CATCGTAAAG AAGTACTTCA AGCACGTGAA 5460

GATGACATCA AAGAAAAAGT TGAAAACTGG ATGTCTAAAG AGTCAGAATC ACGCTTGAAA 5520

CGCATTTCTA CAGAGTTGTT AAATGAATAT AACGATGTTG ATTTAGTTGC TGCACTTTTA 5580

CAAGAGTTAG TAGAAGCAAA CGATGAAGTT GAAGTTCAAT TAACTTTTGA AAAACCATTA 5640

TCTCGCAAAG GCCGTAACGG TAAACCAAGT GGTTCTCGTA ACAGAAATAG TAAGCGTGGT 5700

AATCCTAAAT TTGACAGTAA GAGTAAACGT TCAAAAGGAT ACTCAAGTAA GAAGAAAAGT 5760

ACAAAAAAAT TCGACCGTAA AGAGAAGAGC AGCGGTGGAA GCAGACCTAT GAAAGGTCGC 5820

ACATTTGCTG ACCATCAAAA ATAATTTATA GATTAAGAGC TTAAAGATGT AATGTCTTGA 5880 

GCTCTTTTTT GTTTTCAATA ATTGATTCTC TGTAGATATC aAAGTaCTAA CGTTTTAAAG 5940 

GTTAAATATT TAATTGGATT GAGATCTGTA TGCGGTTATA TCaTTCTGTG TAAATATGGT 6000

TCTCCACCAA ATGTGGTGAG TATATAATTT AAAGAACTAT TTTTAAATTA AGAATAATCG 6060

AACATAAATA AACTTTATGA AATTTCAGTA TCATGTTCTT ATAAAAAACA ATAGGGCTTT 6120

TTGctGACGC TAGTGCGCGA TAAATAATAA GTTGAATATA AAAAAGATCA CTGCCAATCA 6180

TTCGTTTAAT GGCAGCGATC TTTTTTATTT AATTATTTCT CTTTCCACTG CAACATTTGA 6240

TAACCAATGC GTGGATGTGT TTTAATAATA TCTTTTGCGT CCTCATGACA TTGTGAAAGT 6300

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CCATATATTC GTTTTAATAT CATCTCATAA GTGAGTACTT TTCCTTTATG ATTTGACAAT 6420

AGTTCTAACA AGCTAAATTC ATTTGGCGTC AAATGTACCT CCTGATTATT AATAACAACA 6480

GATTTGGAGC CAAAGTCGAT GCTTAGCAAA CCGTTAGTAA ATACAATGTT AGTTTCTTGA 6540

TGTGACTTAG CGATTCTCTC GATGACTCGT ATTCGTGCCC GAAGCTCATC AACATTAAAA 6600

GGTTTAGTCA TATAGTCATT CGCACCGTTA TCTAAAGCTT GAATAATTGT TTGTTCTTCT 6660

TGTCTTGCAC TTATTACAAT GATAGGAATG TCAGTATGTT GCCTGATTTC TGAAATCAAA 6720

CATAATCCAT CTTTATCTGG TAAACCTAAA TCTAATAAAA TGACATCTGG TTTATCAATT 6780

TGAATTTTAA AGTGTGCTTG TGTGGCATTG TCGGCTGTAG TTACATTGTA ATAATCTAAA 6840

GTTAATGCAA CATCAAGTAA ATGTGTGATT GCGTGATCAT CTTCAATTAT CAATATTTTA 6900

GATTGCATTA TACGTCTCCT TCGTTAAAGT CTGTATATAT ATTGAAATAG AATATACTGC 6960

CGTGTGGTTG GTTCGGTTTA TATTGTAAGT TTGATTGATG TTTGTGTAGG ATAGTCTGTA 7020

CTAAATATAA GCCTAGTCCC ATGCTTTCTT TTTGGTTATC TTTAAAATAT TTATTTGATC 7080

CTGTGTAAAA AGGCTCGAAT ATCTTTTGTt GTTCTTCTAA ACTAATTCCA GGTCCTTCGT 7140

CTATAACGGC AAATTCGATT TGTTCATAGC TAGCATAACG AATAGATAAA TTGATTTTGG 7200

TGTCAGTAGA AGTGTGTTTA ACTGCATTTT CAATCAAATT GAAtAAAgCT TGTAAAATCA 7260

ACTTACTGTC AATGTGTATA AACtGTAAAT TTACTGAGGA TGATACAGTT ATACGCTTTT 7320

TTAAATGGCG ACGTTCTAAA ATACATATCG ATTTCTTATA CTA 7363

(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10470 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 

TTAACAATCG ATAACCACAA TACTTCTATT GTAATTGTTT AACGATTTCn CGATTAAAAT 60

CATCTAAATC GTCTGGTACT CGACTTGTTA CAATATTGTT GTCTACAcTa CTGACTCATC 120

AACTACATGT GCGCCTGCAT TTGATAAATC TTTGCGTACA TTTAATACTG CTGTTAACGT 180

ACGACCTTTT AAATCGTCTG TATCTATTAG TATTTGTGGC CCATGACAAA TGGCAAATGT 240

TGGTACATCA TTTTTAGTAA AGTATTTAGC AAATGTGCCA TATCGACCTT CTGTATCTCC 300

ACGTAAATGA TCTGGTGAAA ATCCTCCAGG AATTAATAAT GCATCATAAT CTTCTGGTTT 360

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ATTTGCAGTA TCTCCAATCA CTACAGTATT AAAGCCTGCA TTCTCTAATG CCTCTTTAGG 480

GCTTGAATAT TCTATATCTT CAAATTCGTT TGCTAGAATA ATTGCTACTT TTTTAGTCAT 540

TGAAAATCAC CTTTCTATAT ATCATTGATA TAATTACTAT AGACAAGTAA ATCAGTGATT 600

AAACATACAA GATATAAAAA ATATTAAGCG ACTGTCGCGA TATCTAACCC TAACACATCT 660

TATGTGGCAT TTACTTAGAT ACTAATTTAA CCTTTTCTTC AAGCTGATCT AACAATCCAA 720

TCCATTCATC TATATCTTCA ACACGTACTT CATCAGGATT TACATGATCG ATATCCTCAA 780

TAAACTTATT TAAACGCGCT TTTATCTGTT CGATTGTTTG CTGTTCATTC ATAAAAAGTT 840

AACTCCTTTT ATTTTGTTTT CTTTTTCATT ATTATCCTAA CAGAAATTGC GTTAAAGCGA 900

TATAATCTTA GCTATATTTA TGACATTCAA ATTATTTTGA CTTTTAAAAA TCCCCTTTTC 960

AATTAACTAA AATTAAGAGA TAATTTGTTA CGAGTGATAA TACGAaGkGG TaTCATACCG 1020

ATATGAACCA AATAGAAAGA AGGAAGTTTA AGACGATGAA TAGCGTCAAA TTGAAGCAAC 1080

CTGTTAGCAT TTACAATGAT CCATGGGAAG TGAAATTTAT ATACATTTAA ATTTCATGAG 1140

ACAATAAACG TTGATTTAAT GCGTTTTTTT GCCTTTTTTA TTTTCCTTAT TTTTTCTGTT 1200

TTACAACAAA ATGGTATCAA AAATGGTATC ATTTGTAGTT ATTTTAGCTT CACATATTAA 1260

AACAACCACA CTCCTAAATT AATAGGTGGT GTGGTTTTGT TGGTTGTGTG GGGATAAAAA 1320

TAACCGCATC AGTTAAGATG CGGTTATCTA GCAAGGGCCA CGTATTTATA AATACGTTTA 1380

GAATCTCTTC GGCAACTTTG CTATAGACAG TCTATGCTGT TACTAAATTA TACCACCACA 1440

CAAACCTACT CCCATTCAGG AACACAGAGC TTTGTCGCTC GTCAGCAACG TCATATGAAT 1500

TCTCAGTTCA TGTTGTGGTG ACACTTTAAA CGGTCTGTGC CAGTAGCGAC CGAGTCATTT 1560

CAAGAATGAC CATTTCACAT TTATATTATA ACACTTGTCG TGCGTAACTG TATAGTTTTT 1620

CAGTTGTATT TAAAGTTAAG TTATCTACTT CGCGCTTTCC TTGCCTTAAT TGTGAAATTA 1680

CATATTGCGC TACGCCAGTT TGTTTGTGAA TTTGGTAACC TGTTATATCA CTTTTGATCA 1740

ATTCAATTAT TTTTAATTTA TAATCACTCA TATTATCTAC GTCCATTCTT TTTATCTAAA 1800

CAATAAAAAT GTGTCTTTCT CCCGATAAAT AATAACAATG GTAGGCTTAA TAAAAACAAT 1860

ATTAAATACA TTTGTTCTGT CATAATTGAA AACCTCCAAA TAATATTATA TTATATAAGT 1920

GTAAGGAGGA GCCATCAGGC TCCAAGCATA ATGTTAATCT TTGTTGTTTG GCTTTCGGTC 1980

TAGGTAGCCG AGATGCCaTT CTCTAAGTTG TTTTAACACT TCTGGAATTA TCAGTACTGC 2040

CAATACTTGA TGTTCTAGAA GTGTTTTTAT TATGTCTAGC ATGAGGCTTT TCACCTCCTT 2100

ACACATAATT TGTAAGTCAT CAACTAACCT ACAAATATAA TTATACTAAA CAAATGTTTA 2160
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GTTATCTACA TTTAAATCTT GAGAGAAATG TTAAAAAGTT CTAGTAAAAT AATAGCACAT 2280 

TTTATCTTTA AATGTAAATA GAAAGCAGGT ATGTAACGCA CCTGCTTAAA TAG a CATGAC 234 0 

TATGTCATTC TAACTGATTT CTCCCCATAA GTCACCTAAT ATCTGATTAG GTGGGGCAGA 2400 

ACCATTCCAT GTTCTAATAG GCAAGTAATA ACGTTGCCCC TCCCATGTAT ATCCTACCCA 24 60 

AACATGACCA TCTTGTAACA TCACTTCTGT ATAATCACAA TACCCACCAG GTTGGAACTG 2520 

ATAACCCACT GGACAAGATA AGAATGGCCC CACTTTTCTT ACTGTGATTG GTTGATTGCC 2580 

GTTTGTGAAT CTAGCACTTT CTTCCATGTA GTAAGTACCA TATTTATTAC GTTTC CATGC 2640 

ACTTGCAACT GGTTTAACTG TATTACTTGA AGCGCTTGAC TCATTAGAGA CAGTGGCAAC 2700 

CGGTATTTTA CCATCCATGT ACGCCCTAAT CTGCTTGATA AAGTAGTCTT TAAGTTGCAA 2760 

CCGCTTGTCT TCTGGCAATA GACCGCGAGT TACTGGGTCA AAACCAGTGT GTAAAACCGA 2820 

ACTTCTATGA GGGCATGATG TTGAAGTAAA TTCATTGTGC AATCTGATTG TATTTCTGTT 2880

TGCTGGTAAT CCCCATTTTT TCAACAATCT AGCGCATTCT TGGAAAGTTG CCTGTTCATT 2940

TTTTAAGAAT GTCGCGTTAT CTGCGCCCAT TGATTGACAT ACTTCAATAC CGTAATAATA 3000

TTTATTACCT ATTTGATTAG CGGTATGCCA ACCTACTTGT GATTCATCTA AGGCTTGCCA 3060

AACTGTGTTG CCTGATACGT AACTATGCGC AATGCCCGCT TCTAATCTTG ATAAAGGTGC 3120 

ATTTACTAAT CCGTTACGAT ATGCTTCAGC AGTCGCCCCT TTGCTCCCTG CGTCGTTGTG 3180 

TATAACTATA CCTTTAGGGT TACTACCACG CTTAGGTAGG TCATAACCTT TAACCACATC 3240 

TTTGATGATT TTAAGTTCTA CTGCTTTAGG TTGTGGCTTA GCTGTTTCTT TTTTAGGTGC 3300

TTGTGTAGGA GATTGAACTG ATCGTGGCGC TGTCTCACTT TTAAAATTCG GACGGATAAA 3360

CCACATAGGG AAATCATAAG CATGTTGTCG TCTTGTAACT TTTTCCCAAC CCCAGCCGGG 3420

TTGTTCGATT CCGTCAGTCC AGCCACCGCC TAGCCAATTC TGCTCATATA CAATGATGTA 3480

ATCTAAAGTT GCTTCAATTA CCCATGCAAC GTGACCATAT CCAGCACCGT AGTTGCTACC 3540 

GAATACCACC ATGTCGCCAG GTTGTGCTAA GAAGTCCGGT GTATTTTGGT ATACAGTAGC 3600 

TAATCCGTCG AAGTTGTTAG CGAACGGAAT ATCTTTTGCA CCTAAACCTT TTAGAAGTAA 3660 

TCCAAACAAA ACTTTCCAAC CAGCATTGGC ATAATCAAAG CATTGAAATC CATACCATAA 3720

GTCCACATTG AATTGTTTTC CCTCAGAAGT TTTCAACCAC TCTATAAACT CATTTTTAGT 3780

TAATTTTGCT TGCATTGTCG CCACCTCCAT GATGATACTC ATTCACATCA AAGCCAACAT 3840

CGTTAGAGGC GTCTGTGAAA GGTTGTGATG TATCATATTC TTTTGGTGcT TTCGCGCTTA 3900

ATTCCGGCGT TAAACTACTG TCTTGTGATG ATTTCCACGT AACTTGTTGT TCTTCTTTTT 3960
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TTGGGTCAGT AATAACGCCA ATACCTGTAA GTAACGTGAG GATAGCGCCT ATAATTGCGC 4080 

TAGCTTGATT TAATTGAGTA GATAAATCTA ATCCGAATAA ATCCGTGACT TGCTTGATAA 4140 

ATAGCAACAA TGCTCCAACT AAACCAGTTA GTACTGCTTT GTTTTTGAAT CTCAATTTCC 4200

AGTTAATATC CATTTGTTTG CTCCTTTTAT CCAAAATAAA AAAACGACTA AAAATTAGTC 4260 

GTTTAAAATT ATTCAATGGT CAATGTCGGA GATCCTGAAT AAACATCACT TATAGTGACG 4320 

TACAACATCC CTGAAGGATT ACTAAAGTTG ATATTTTTAC TTGCAACTCC GCTATTGACT 4380

CCTGATATTC CTAAATCACT TGACCCTAAA TTAGTTTGCG AAATCCTCAT TATACCGCTA 4440

CGTACATTTT CTATTGTCAC CTGATAACTT TTATTGGGTT CAACTCCATT TATTGTCCAT 4500 

TTTGCTGTTG ATTCTTCTAT GCTATCCGGA TATTTATTTT TAGGTAAGGG TTTTATTACA 4560 

AAAGATGAAG GCTTTTTCCA TACTTGGATA TTTCCAGCAT ATACTTTTGT ATATTCTTCA 4620

CCTTCGTAAA TAAACTTCTT TACATTTTTA AAATTACCTT CCATAAAAAT CACCCTTTAA 4680

TTAAATATAA CGTATTCGGG TCTTTTTGAT ATATATAGTT ATATTCATTT TCTGTTCCTG 4740

TCCAAATTTT AACCGTCGGT TGAGATGCGC TTTTTAGTTG ATATAAATTA TCCGCTTGTT 4800 

GTTTAGTAAA AGCTTGAGAT GACAAAACAT ACCGCTCGTC ATGATTATGA TTTTTTGGAG 4860 

CATATAAATC ATTTAGTGTT TGTTTGAATT CCTCAAAATC TTCTGTATTA ACTTTTGAGC 4920

CAATCTGTTG CAATACACTT TCTGAAATAG AGTTGTTTTG TATTGCTTCT GCTAATTCTC 4980

TTAATGTGTT CATAGATTCA GGCGCGCTAT CAACTAGTTC AGCAATTTTT GTATCCGTAT 5040

ACGTTTTAGA GTCGTTGAGA GTTGTATCTT TGATTTTTTC AACTTCTTGC AATTTATTTT 5100 

CTAACCCTTC AACATTTGCG ATATTGATTT TGTCCAATAA CTCAGGTTCT GCTTTGATAT 5160 

CTGTATCTTT ACCATCAATT TGCCACATTT TAGTGTCAGG ATTGATTGAT ACTACAGTAC 5220 

CGTTJTTACC GGGTGCGCCT TGTTCTCCTT TTTTACCTGC TTCACCTTTT GCTCCAGGTT 5280 

GTCCCGGTTC ACCTTTATCA CCTTTCGCAC CTTTAAATCT ACTTTCATTC TTTTCGATGT 5340

AAGAAATGAC ATCTTTATCT ATTTTCTCTT TAAAGTCTTT GCTCAATAAA TCTGTCGCGT 5400 

TATCTTTTAA AATTCTCGTA ATAGCATCAT CTACCAATTT AACATCGATT TCTTTTGCTA 5460

CAGCAGATTC AATACCACTA TCAACGATAT TGAAAGAAAA GTTTGCGACA TGTATTTTTT 5520

crrcrrcT i v ctctaaaaac agcttacagc gaacataacc agcgtgtttg ataacctttt 5580

TAGGTATCTT GTAGGTAAGG AAACCTTTTA CAACATCGTC GATAATAAGG GGCTCATTTT 5640

TGAATATAGA GCCATCTTCC ATAAACAAAT GTAATCTAGG TGTTAAGCCA TGTGCTTTTA 5700 
GATCGATACG ACCTTGTTTG TCATTGATAC CTATTCTTAT AGATGCTGTA TTTTCATCTT 5760 
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CAACATCTTT TATTTTGTAC ATTTACACAC CTCTTTATTT ATATTTATCC CTTGTGAAGT 5880

AGATACCTTT TAAGCCGATT TGTTTATATA ACTTAGCGAT TGTACTTGCT TGATGTTGGC 5940 

ACCACTCTAT AGCAGTAGCG TATTGGTGGG TAGCTGGATT CTTAGGATTC CATCTAATTC 6000 

GGTACAATGT GTTTTGACCT TTATTGATGT AATCCTTTCT TACGAAGCTA GCACCGCCCA 6060 

TGATTGCTTT TGCTGGAGAT GTCCAACCTT TATTCCTTGC AAACGTCATT GCGTAGTTAG 6120

GATTGTTGTC GTAAGCGCCA ATGCCGAAGT AGTTGTATAC TCCATCTTTT CCGTTAGCGA 6180 

AGTTACTTGT TCCATATCCA CTTTCTAAGA AAGCATGCGC GATTAAATAA ATTTCATTAA 6240 

TGTTGTGCTT TTTACAAGCT TCTGCGAACG CTTTACCTTG ATTATTCAAT GTTCCCTTAC 6300 

CTTTAAGTAT CTTATTAAGT GCGCTAACTG AAACACCTTG ATACTTGCCT AAATTAAGCA 6360 

TTTGGTAGCA TTGTGTGTTA CTTTCCCATA TACGCTTTAC ATTCATTGCT GAACTCGTTT 6420 

GTGCTCGTGT AGCGTTAscC AACCCCAAGC ATTAGATTTT TTCGGGTTAC CTCTTGCCAT 6480

TTGTTTATCC AGTGCTTGTT TGAATGTATA AGGACTCGTT TCTGTTATGA TCTGCGGTTG 6540

TTTAGATGCC GAACCATTGT TGGCTGTTGG TGACGAGTCT CTTACATTAG CTATATCAGC 6600

GTTTTTATTA TCTACCATAA CTTTTATTCT AGATTTTGTT ACTGTTGGCT TAGTTATAGA 6660

ATTTAATAAT TTTTCTCTGT TTTTAAATAT ATTAAGTAAT GCCTTTTCTA ATGCTTCGTA 6720

TTTATCTTTA GGAGGAACAC CGTTGTCAAT CATATTCCAA TTAACATGTT CCAACATTGA 6780

ACGCCAAATG CTGTCGTCTA CTTTTAAATT TTCAATACTT AGAGGTATCT CATATTTGGC 6840

CATCATATCT ACAGCTACAA CCATTGCGTG AATCTCATTA AAAATAAATT CATTTTTACT 6900

CGCACTATAA TCTTCACATA CGTCTATAAC TATATAATCA GGTTCATTAG GAACTTCAAA 6960 

TACAGCTCTT CTAGGTGCCC AAATATTATG TCTATCAACA TAAAAGTGGG GATATTCTAC 7020

ATCCTGTTTG TATTTCTTCC TACTGTTATA TAAACTTTCT ACCGAGCTCA TCGTTTGTGC 7080

GTTTCTAATC ATTATTCCTT TAGGTTTTTC GAGTCGTCGA TTACCTTCTA CTATAAAGTG 7140

ATAAATATAT TCTGGATAAT TAACCTCTTG GCTAGAAATA GTGTACTTTA TAGTTGTTAC 7200

ATCTTTCCAA ATTGGAACTT TTTTATTATT TTTTTCGTTA TCATCACTAT CATCTTCTGG 7260

TTTAGGTGCC GGTGTAGTTT TGTCTGGATG ATATGGTGGT CTAACAAAAT ATTTAACCCC 7320

TCCACCTGGT CCATCATGAT AAGAGTGTTT AATTTTATAA GGTGGACTTC CTGTTGCGTT 7380

ATTTGTATAC CAGTTTTGAT CTACGCCATA CCAATAGTCT TTTGTGCATG GTCCCACTAC 7440

AATGTTTACA TGTCCTGCCC AACCACCAGT CCAAACACCC CAGTCGCCTG GTTGTGGTAC 7500

AAAATCTTTT GTATTTCTAA TTATCTTGAA ATCTCTACCT CTATAATTGG ATTTTTGAGC 7560 
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TAAATCCCAG CATTGTGCTC CCATTCCAGA ACCAGGTACA TCAATAGCTA TTTTGTTTTT 
AGCGATATAT AACGCCCATT CAACCACTTC ACTAGCTGTG GGCTTTCTAT TTTTCGGATT 
AGGTAATCCC ATGTATGCAC CTCATTTCAA TCAAAATAAA AAGCCAGTGC CGAAGCACTG 
ACTCTTAACT GTTATTTACA TTTACCAAAC CAGAAGCACG CCCAGAAGCT ATATCCTAAA 
ATCCCTTTAA GCATGGTAAT CACCTCCTTT AAATACCAAA AACAGTTCTT AGTAAAGCTA 
TGACAATCGT ACTGAAGATA GTCCCTATCA AACCTAGAAT CCACATTTTT ATGTCTCTAA 
TATTCTTGGC ATTCTTTTCT TTATTCTTTT CATCTTCTAC CTTGTCGCGC TTTAATTCTT
CAAAATTTCT ATCTAATTTG TCATAAATCT TTTCTTGCGC TCTAAGACTA TCTTCTATTC
TGTCGAATTT TTCAAACATA GTCTTATCAT TTTCTTCTAA TCGCGTTAAA CGCCAATCTT 
GTTCATGTCG TTTGGTAAAT CCAAACATTA TGCCACCCAC TTTATTCAAA TTAAAAAGCC 
ACAAGCATTA CACCTGTGAC TTTTCATCTT TTGTTTCTGG ATATTTTTCT CCAGTGATTA 
AAGCGTATTC TTCTTTATCG ATTAAACCCT TGTCTACGTA CCACTTAATT TGCTCGTTTT 
TATAGTAACC CCAAACATAA AAAGTTTTAA TGTCTTTAAA AGTTGGATAA ATCATCTTCA
TTATTTAAAC GTCCCCCTCA GTACTTGTTT TGTTAGTTTT CAGTTCAGTC AACTGTTGTG
TTAACATAGC GTTTTGTTGA GCTAATTCCA TTGTTAATAC GTTTACTTGT GCCACCTGCA 
TTTGCATACT CGCAACCATT CCGCGAAGTT CCTCATCACT TAAATCTGAC GCACTTTGTT 
GGTTTGATGC ATTCGGTACG TCTTCTTTTT CGAAATTGCT ATTGTATTTA ATTTCGCCGT 
TAGTGAAAAC AAACTTTCTA GGTTCGAACT CTTCTTTAAA TTTAATAGGC ACATTGTTAT 
CATCTACATC TAAACTATTG CGTAAACCGC CAGTATTAAC GAATCCGATA ACTTCGTTTT 
TATCGTTTAC TGTGATTTTC ATTATTTCCA CCCCATAATT TTAGTTATAG TAACTTTGTT 
GGCATTCGCT CCAGAACCTG ATGTTTTACC TAAATCAAAG TACACATCGT TATCTATTCT 
TAAAGTAGTG CTACTTGTTT TGGATAGTAA GCACTCATAA ATACCGCCAC CGTTGCCGTC 
TGAGTCAACT ACATTCGCTT TACTCAATTG AATCGCGTTA GGTAATGCGG TTAGTCCGAA 
TCCCTCAATA ACGCCACCTG GATAAGTTCC ACTTACCAAC AAAATAGAAT AGTTTGTGTA 
CGGTTCAGTT AGATTGATTG TTGTACCTAC ACCATTTGCG CCACCGTCGA ACAATACCGT 
TGATTTATGT TCATTAGGAA CTGTCCACTG TTGCTCAAGT CTGCCGTTTG TGATTGATCG 
TGTGTAAATC TTTTTAGAGT TATAAGGTGT GAAGTTAAAT AGCTTGTTTG TATCATCTTT 
AACGAATACC GATAAATAAC CCTCATAACT TTCAACGCTA CCTGGTAAAT CCGGCACTCT
TGTTGCATAG TAATTACCAG CAGTTAAATA TCCCAAATCG CCTTGCGCAT TATTTAAGTT 
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GAATTTATCA TCTACATACT GCTTAGCTTG ATTTAAAGCG TTGTTAGACG TTTCTTCAAC 9480

AAATTGCTTA GTTAAGTTTC CATCATTCTT TTTATAAAAC GGGTACCATG TGCCGTAGAT 9540

TTTGTATTTT GTGTACTCAT CGTTTGAATC GTCTGGGTAC CATGTTGCAC GAGCAGTATT 9600

ATTATCAACA ACATAAACAA CTAACACACC AGATTTGCTT GATGTATAAG TTGATTCATC 9660

GAACGAAGAA CCGTCATCAA CACCATCTTG TCCAGGCTTC TCTAACGTGC CTATATCCGT 9720 

CTTTTCTGGC GCATCTGTTG CATTAGTAAT ATGAATAATC CTAGATGTGT TAACTGCGCT 9780 

TAAAACGCTA TCTATGGACT GCTCATACGA TTCAATTGCT TTACCGTAAT CATCTGTAAG 9840

TTTAGACTTT TGCCAATTCG TTGTTGAATT ACCTTTAACA AGGTCAGCGC CATTGATTTG 9900 

TTGTTCAACT TCGTTAACAC GTTCAAAAAT CGCTTGCTCT TTTTCAACTA TTTTATCGAA 9960 

TTCAGCTGTA ACAGCTTGTG TTGCACTAGT TTGCGTCGCA GTAATAGCTT GTATAGCTTC 10020

GTTTTGCTTG ATTTCGATTT GTTGAATGCC TTTTGTCGCA CTATCATTCA CTTTTGCTAT 10080

TAACGTTTGT GTATCAGCCA TATTTTGCTT TAATTGGTTA AAATCCTTAC CGACAGCTTC 10140

GATAGTATCT TGAATAGATT TGATATAAAC AAGCTTTGTT ATACCATCAA ACCCACTAAC 10200

TAAATCATTT TCAATATTGA AGCTAAATTG ACGTTCAACA ACAACATTAT TACTCCCGTT 10260

TTGTGTAAAG AATGCCTGAG CATGCACCTT GCCTGAATGT TTTAAAAATT CATTCGGTAT 10320

CACATACTGC AAACGCCCAT TAATTGCGTC TACTATCGTT AATTCGTCTG AAATATAAGC 10380

GCCTCTATCT ACGTTATAAT CATCGGTTTT TAAnACGATA GATGTTTTAA CATGTTCAGA 10440

ACTTATAGAT AAGGGTCTGT TATnCTTAGT 10470
(2) INFORMATION FOR SEQ ID NO: 21: 
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(i) SEQUENCE CHARACTERISTICS : 

(A) LENGTH: 3647 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:

ATCAGATCTT GAGAATCGAG TTATTAAGTC TATCGAAGAC TTAACTAAAA TCCAACCATT 60 

CATGCCTACA CAAGATTTTG ATTTTAAAAC TAAAGAAATT CAATCAAACA TTTCTGAAGA 120 

AAGATTTATC GAAATGATTC AGTATTTCAA AGAGAAAATA ACAGAAGGGG ATATGTTCCA 180 

AGTTGTGCCA TCAAGAATTT ACAAATATGC GcATCATGCT AGTCAGCATT TAAATCAACT 240

TTCGTTTCAA CTGTATCAAA ATTTAAAACG ACAAAACCCA AGTCCATATA TGTATTATCT 300 
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TCAAATTGTA ACAACTAATC CTATTGCAGG TACGATTCAA CGTGGTGAGA CGACACAAAT 420

AGATAATGAG AATATGAAAC AACTACTTAA TGATCCAAAA GAATGCAGCG AACATCGTAT 480

GCTAGTTGAT TTAGGACGTA ATGATATTCA TAGAGTAAGT AAAATCGGTA CCTCAAAAAT 540

TACTAAATTA ATGGTTATTG AAAAATATGA ACATGTTATG CATATCGTAA GTGAAGTCAC 600

AGGTAAAATA AATCAAAATT TATCGCCAAT GACAGTTATT GCGAATTTAT TACCAACAGG 660

TACCGTTTCA GGTGCACCAA AATTACGTGC AATTGAAAGA ATATATGAAC AATATCCACA 720

TAAACGGGGC GTTTATAGTG GTGGTGTTGG ATACATAAAT TGTAATCATA ACTTAGATTT 780

TGCATTAGCA ATTCGAACGA TGATGATAGA TGAGCAGTAT ATCAACGTAG AAGCTGGTTG 840

TGGCGTTGTA TATGATTCTA TTCCTGAAAA AGAACTGAAT GAAACGAAAT TGAAAGCTAA 900

AAGCTTATTG GAGGTGAGCC CATGATCTTA GTTGTAGATA ATTATGATTC CTTTACATAT 960

AACCTAGTGG ATATTGTTGC TCAACATACT GACGTCATTG TTCAATACCC TGATGATGAT 1020

AATGTGCTGA ATCAATCGGT GGACGCTGTT ATTATATCTC CTGGTCCAGG GCATCCATTA 1080

GACGATCAAC AGTTAATGAA AATCATATCA ACCTATCAAC ACAAACCCAT TTTAGGTATT 1140

TGTTTAGGGG CTCAGGCACT GACTTGTTAC TACGGTGGAG AAGTCATTAA AGGCGACAAG 1200

GTTATGCACG GCAAAGTTGA TACACTAAAG GTTATATCGC ATCATCAACA TCTGTTATAT 1260

CAAGATATAC CAGAACAGTT TTCAATTATG AGATATCATT CATTAATAAG TAACCCTGAC 1320

AATTTTCCAG AAGAATTGAA AATTACTGGA CGTACCAAAG ATTGTATACA GTCATTCGAG 1380

CATAAAGAAA GACCGCATTA TGGTATTCAG TACCATCCTG AATCATTTGC TACAGACTAT 1440

GGTGTCAAAA TAATTACAAA TTTCATTAAT CTAGTGAAGG AAGGATGAAA ACCATGACAT 1500

TACTAACAAG AATAAAAACT GAAACTATAT TACTTGAAAG CGACATTAAA GAGCTAATCG 1560

atatScttat TTCTCCTAGT ATTGGAACTG ATATTAAATA TGAATTACTT AGTTCCTATT 1620

CGGAGCGAGA AATCCAACAA CAAGAATTAA CATATATTGT ACGTAGCTTA ATTAATACAA 1680

TGTATCCACA TCAACCATGT TATGAAGGGG CTATGTGTGT GTGCGGCACA GGTGGTGACA 1740

AGTCAAATAG TTTCAACATT TCAACGACTG TTGCTTTTGT TGTAGCAAGT GCTGGcGTAA 1800

AAGTTATAAA ACATGGtAAT AAAAGTATTA CCTCaAATTC aGGTAGTACG GATTTGtTAA 1860

ATCAAATGAA CATACAAaCA ACAACTGTTG ATGATACACC TAACCAATTA AATGAnAAAG 1920

ACCTTGTATT CATTGGTGCA aCTGAATCAT ATCCAATCAT GAAGTATATG CAACCAGTTA 1980

GAAAAATGAT TGGAAAGCCT ACAATATTAA ACCTTGTGGG TCCATTAATT AATCCATATC 2040

ACTTAACGTA TCAAATGGTA GGCGTCTTTG ATCCTACAAA GTTAAAGTTA GTTGCTAAAA 2100
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AAGCAACACT ATCTGGTGAT AATTTGATAT ATGAATTGAC TGAAGATGGA GAAATCAAAA 2220 

ATTACACATT AAATGCGACT GATTATGCTT TGAAACATGC GCCGAATAGT GATTTTAAAG 2280 

GCGGTTCACC TGAAGAAAAT TTAGCAATCT CCCTTAATAT CTTGAATGGT AAAGATCAGT 2340 

CAAGTCGACG TGATGTTGTC TTACTAAATG CGGGTTTAAG CCTTTATGTT GCAGAGAAAr 2400

TGGATACCAT CGCAGAAGGC ATAGAACTTG CAACTACATT GATTGATAAT GGTGAAGCAT 2460

TGGAAAAATA CCATCAAATG AGAGGTGAAT AATATGACGA TTTTATCAGA AATTGTTAAA 2520

TATAAACAGT CACTTTTACA AAATGGCTAT TATCAAGACA AACTTAATAC CTTGAAAAGT 2580

GTGAAGATTC AGAATAAAAA ATCTTTTATA AACGCAATTG AGAAAGAACC AAAGCTAGCA 2640

ATTATTGCAG AAATTAAATC GAAGAGTCCT ACAGTTAATG ACTTACCTGA ACGAGATTTA 2700

TCGCAACAAA TCTCAGATTA TGACCAATAT GGTGCAAATG CCGTGTCCAT TTTAACTGAT 2760

GAAAAGTACT TTGGTGGTAG TTTTGAAAGA TTACAAGCAT TGACGACAAA AACAACATTA 2820

CCCGTATTAT GCAAAGACTT TATTATAGAC CCGCTTCAAA TTGATGTTGC TAAACAAGGT 2880

GGTGCATCTA TGATTTTATT GATCGTTAAC ATCTTATCTG ATAAACAATT GAAAGATTTA 2940

TATAACTACG CTATATCGCA AAATCTAGAA GTGTTAGTTG AAGTACATGA TCGCCATGAA 3000

TTAGAACGTG CCTATAAGGT TAATGCTAAA TTGATTGGTG TAAATAACAG GGACTTAAAA 3060

CGATTTGTTA CAAATGTGGA ACATACAAAT ACTATTTTAG AAAATAAAAA AACAAATCAT 3120

TATTATATTT CTGAAAGTGG TATTCACGAT GCATCTGATG TAAGAAAAAT CTTGCATAGT 3180

GGTATCGATG GCTTACTAAT AGGTGAGGCG CTTATGCGTT GTGACAATCT ATCTGAATTT 3240 

TTACCACAAC TGAAAATGCA AAAGGTGAAG TCATGATGAA ATTGAAATTT TGTGGCTTTA 3300 

CATCAATAAA GGATGTTACA GCGGCCAGTC AATTACCTAT TGATGCGATA GGTTTCATCC 3360 

ATTATGAAAA AAGTAAAAGG CATCAAACAA TTACCCAAAT AAAAAAGTTA GCGTCTGCTG 3420 

TTCCAAATCA TATCGATAAA GTATGTGTCA TGGTAAATCC TGATTTAACA ACAATTGAAC 3480

ACGTATTAAG CAATACGTCA ATTAACACAA TACAGTTACA CgGCACAGAA TCTATTGATT 3540

TTATACAGGA AATTAAAAAG AAATATTCAA GCATTAAAAT CACTAAAGCT TTAGCTGCaG 3600 

ATGgAAAACm TwATCCCAAA caTtAAtnAA tnTTAgGGGG TCCGTGG 3647 

(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5966 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 





CcAcCTTGAC CACCTTTACG TGGAATCTTT TCmCCTkGAG CAACaTCGaT AATaTATATT 60

GAAAgTCAAC AAGTTCTGGA CTAAATGTTG CTGCTAAGTT ATCGCCACCA GATTCTATGA 120

AAATTAGTTC TATATCGTCA TGACGTTCTA ATAATTCGTC TATTGCTGCA AAGTTCATAG 180

ATGCATCTTC ACGAATCGCA GTATGAGGAC ATCCACCAGT TTCAACACCA ATGATACGAC 240

TTTCAGGTAG AACTCCTGAA TTTACTAATA TCTTTTCGTC TTCTTTTGTA TATATATCAT 300

TTGTAATAAC GCCGATACTC ATTTCTTTTG AAAGACGTTT TACAACTTTT TCAATTAATT 360

GTGTTTTACC TGCACCTACA GGACCACCAA TACCAATTTT AATCGGATTT GCCACAATTA 420

TAACCTCCTA TGATATGAAA tTCTAACATT GaCGTTCTCA TGCGCCATTT GATTTAGTTC 480

TAAACCAGGC GCTGTCATGC CAAAATCTGC TTCTTTTAAT TCGAAAATCT GCTTTCTTGT 540

TCCTTCTATA TAAGGAATCA TGTGAGTAAC TATCTTTTGA CCAGCAGTTT GTCCAAGTGG 600

AATAGCACGA ACAGCATTTT GAGTTAAACT TGAAACATTT TGATATAAAT AGTAATCAAT 660

AATCGTTTCA ATATCTACAC CTAAATGATG GCCTAGCATA GTAAAACAAA TAGCTGGATT 720

TnACTTTGCT TTCTTATCTT GCATTTGTTG ATGATACCAA GCAATCCATG GGCTATtATA 780

AAGTTCTAAA GCCAATTTAA CCATGCGAGT CCCCATTTGT kTTGCACCAA CACGTGTTTC 840

TTTAGGTAAG TTTTGrACAr ACATCAGTTT ATCTATGTGT AATACTTTTT GTGTATCATC 900

ATTTTCCAAT GCATCATAAA CTAaACGCAT GGCTAAACCA TCAGAATAGG TAAGTTGCTC 960

TTGTAAAAAC atttttaacc AAGCAATAAA AGTATGATCG TCATGAATTA TATTTCGTTG 1020

AATATATGTT TCAAGACCAA ATGAATGACT GAAAGCACCT GTTGGAAACT GTGAATCACA 1080

GAACTGAAAT AATCTTAAGT GTGTATGATC AATCATGAGA ATGCCCTATA TGTCTGAAAG 1140

CCTTATTAAC TTTACGGTCT TCTCGAACAT ATGGGATGCC TAAACTTTTT AATAAATCTT 1200

CAACTAAATA ATCATATTGT ACTAGCATTT CAGTCTCTGT AAATTGTGCT GGCAAATGAC 1260

GATTTCCTAA TTGATGGGCT ATATCTCCCA TTTCTTGCAA TGTTCTTGGT TGAATCACTA 1320

AAAGATCTTC TGAATTAACA TCCACAATAA TCATATTATG GTCATCTGCG TATAAAATAT 1380

CTCCATATTG TAAGTCAATA GGTTGTTTTA AACGAATGCC TATTTCAGTG CCATGGTCTG 1440

TAACGACTCT TTGAATACGT TTAACAAGAT CTGAATTTTC AAGGTATACT TTTTCGACGT 1500

GCTTTTGTTT TTCTGAATTT GACAAATTGG CAATATTGCC TTGGATTTCT TCAACAATCA 1560

TTCTATGTTC CTCCTAGAAT AAGAAGTATC TTTGAGTTAA TGGTAACTCA GTTGCTGCAT 1620

TACTTGTAAT TTTTTCTCCA TCTACATATA CTTCATATGT TTGTGGATCA ACGTCTAATT 1680
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GACGCACCAT GCGTTTTAAA TTTAATGCAC GATTGATACC ATTTTCATAA GCAGTTTTAG 1800

ACACGAATGT CATTGACGTA CTTGTAAGGT TTCCGCCGTA TTGACCATAC ATTTTACGGT 1860

ACTTCATCGG TTCAGATGTA GGTATAGAAC CATTTGCATC GCCATTTACG GCAGAGTTAA 1920

TTAATCCGCC CTTTACAACT AATTCAGGTT TAACCCCAAA GAAAATTGGG TCCCATAAGA 1980

CAATGTCAGC TAGTTTGCCC GGCTCGATAG ATCCTACATA TTCAGAAATA CCATGTGTAA 2040

TTGCTGGGTT AATTGTATAT TTAGCGATAT AACGTTTGAT GCGATTATTA TCATTATGTT 2100

CAAAATCACC ATCTAAAGGA CCACGTTGTT CTTTCATGCG ATGTGCTACT TGCCATGTTC 2160

GTGTAATTAC TTCACCTACA CGGCCCATTG CTTGTGAATC GGAACTAATC ATACTGAATA 2220

CACCCATATC TTGCAGAACA TCTTCTGCTG CAATCGTTTC TTTACGAATA CGTGAATCTG 2280

CGAATGCGAT ATCTTCAGGA ATAGCCGCAT TTAAATGGTG AGTAATCATT ACCATATCTA 2340

AATGTTCATC TACAGTATTA TGTGTATAAG GCAAAGTTGG ATTTGTAGAT GAAGGTAAAA 2400

TATTTGAAAA TGCAGCGGAT TTAATTAAAT CAGGCGCATG ACCGCCACCA GCACCTTCAG 2460

TATGGTACAT ATGAAGTACA CGGTCTTTAA CAGCAGCCAT TGTGTCTTCC ATAAATCCTG 2520

CTTCATTTAA AGTATCTGCA TGTAATGCAA TTTGAACATC AAATTCATCA GCAACATCTA 2580

ATGCATGACT CAAAGCAGAT GGTGTTGCAC CCCAGTCTTC ATGTACTTTT AATCCAATTG 2640

CTCCGGCATT GATTTGTTCA ATGAGTGCAG TTGGATTTGT TGCTTGTCCT TTACCTGTAA 2700

AACCGACATT AATCGGTAAA CcTTCGGCAG CTTCTAACAT TCTATGAATA TGCCATGGAC 2760

CTGGAGTTAC AGTTGTTGCT TTAGAACCTT CTGAAGCACC AGTACCACCA CCAATATGAG 2820

TCGTAATACC ACTTTCTAAT GCGACCTCTG CTTGTTCAGG ATTAATAAAA TGAACATGAG 2880

TATCAATACC ACCAGCAGTG ACGATTTTAC CTTCAGCGGC AATGATATCT GTTGTTGAAC 2940

CTATAATAAT GTCGACATTA TCCATTATAT CTGGGTTGCC GGCATTACCT ATGGCGAAAA 3000

TATAACCATT TTTAATGCCT ATATCAGCTT TAACCACTTT ATCGTAATCG ATAATAACGG 3060

CATTAGAAAT GACAAGGTCT GCAACGTTCA CGTCATCACG TGTTACACGA GGATTTTGCG 3120

CCATACCGTC TCTAATAGAT TTACCACCAC CAAAAGTAGC TTCTTCACCA TAAACCGCAT 3180

AGTCTTTTTC TATTTGAGCA AATAGATTCG TATCACCTAA ACGAATGGAA TCTCCAACAG 3240

TTGGACCGTA TAAGCTCGTA TATTGATTTT GCGTCATTTT AAAGCTCATG ATCTTTTTCC 3300

TCCTTTTTTA TTCACGTTTT CAGCACCGTT ATCTCCGAAT ACACCTGCAT ATTCATCATT 3360

TTCATCAGTT GGGCGATAGA CACGTGACTC ATCGATAGGA CCATTGACCA TACCACGAAA 3420

ACCAAAAATT TTACGTTTGC CAGCATATTC AACTAATTGA ACTTCTTTTT TATCCCCAGG 3480
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TTCGAAATCT AATGCTGCAT TTGCTTCATA AAAATGAAAA TGTGAGCCCA CTTGAATTGG 3600

TCGATCTCCT GTATTTTCAA CTTCGATAAC TGTTTCAGGA TGATGGTTAT TAATTTCAAC 3660

CTCTGTACTT TTTGTAATAA TTTCTCCTGG TATCATTTGA CTGCCTCCTT TAAACAATAG 3720

GGTGATGTAC TGTGATTAAC TTAGTACCAT CGGGGAACGT AGCCTCGATT TCGATATCTG 3780

TAATCATGTG TTCGACACCA TCCATGACAT CTTCTTTGTT TAGAATTTGT CTACCATAAC 3840

TCATTAACTC TGCAACGGTC TTACCATCGC GTGCACCTTC TAATAATTCA TCGCTGATTA 3900

AAGCTAATGC CTCAGGATGA TTTAGTTTCA AACCACGTGC TTTACGACGA CGTGCAACTT 3960

CCGCCGCCAC TACAATCATT AATTTGTCTT GCTCTCGTTG TGTAAAATGC AAATTAAAAC 4020

CCCCAATTTC ATATTAGATA CaATTTACAA AATTTATATT AATCCTAATT GTTGTGATAA 4080

ACAAGTAATA TACAAAGTTC AATGTGTAAT TAGAAAATTA TATTTTTAGC ATATCCGATA 4140

TTGAAGCAAA CAATCTAATC GAAAACAAAT AGTGGAATAT ATTTATGTAA AAACCAAAAT 4200

AGTTTTTAAT ATAACTTTTC ATAGAATAGT AGTATATTAA TGAGTAATGA TTCAAAGGAA 4260

AGGTGAAAGA TTTGAAGATA ATAGATGTGC TTTTGAAAAA TATATCTCAG GTTGTGTTAA 4320

TTAGTAATAA ATGGACAGGA TTATTTATCT TAATAGGATT ATTTGTAGCC GATTGGACAA 4380

TTGGATTAGC GGCTATTGTA GGTAGCATCA TCGCCTATAC TTTTGCGCGT TTTATAAATT 4440

ATAGTGAGGC AGAGATTAAT GATGGGTTAG CTGGATTTAA TCCAGTGCTA ACTGCCATTG 4500

CGTTAACAA7 CTTTTTAGAT AAGTCAGGAT TAGATATTGT TATAACAATG ATAGCAACTT 4560

TATTAACGTT ACCAGTTGCT GCTGCAGTGA GAGAAGTTTT AAGACCATAT AAAGTTCCGA 4620

TGCTGACGAT GCCTTTTGTC ATTGTGACTT GGTTTACAAT TTTACTTTCA GGACAGGTTA 4680

AATTTGTAGA TACATCGTTA AAGTTAATGC CTCAAAACAT TGAAACGGTT AATTTTAGCA 4740

ACAATGATAG AATaCATTTC ATTCAGTCAT TATTTGAAGG ATTCAGTCAA GTATTTATCG 4800

AAGCGAGTGT AATTGGTGGC GTATGTATTT TAATCGGCAT ATTGATAGCA TCAAGAAAAG 4860

CAACACTCTT AGCTGTTATA GCTAGTTTGT TAAGCTTTAT CATTGTAGCT CTATTAGGTG 4920

GTAATTATGA TGATATTAAT CAGGGATTAT TCGGTTATAA CTTTGTATTA ATGGCAATCG 4980

CACTAGGATA TACATTTAAA ACAGCGATTA ACCCTTATAT TTCGACTTTT TTAGGTGTGT 5040

TATTAACAGT AGTGGTGCAA CTAGGTACAA CAACATTGCT TGAACCGTTT GGCTTACCTG 5100

CATTAACATT GCCATTTATT ATCGTGACAT GGATTTTATT ATTTGCTGGT ATTAAACATG 5160

ACAAAGTAGA TGCTTGATAG TTAAATCAAA CCTAATATTG TTTGAATATC ACCTTAAACT 5220

ATACAGCGAA TTGTATAGTT TAAGGTGTAT TTTTATGGAT AAAATTAAGT GCATACTTAA 5280
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GTGTTAAACT AGGAATAAAT AATTTATATT GTGTGTTGTG TGGGGTGACT AATATGAATG 5400

ATATGGATAA ttccttttta ataacaacgg AAATTCAAAG AAAATGGATT GAAAAATTCA 5460

AAGTAATTAG AGATACATTT AAGGCTAAAG CTGAATATAA TGATCAACAT AGCCAATTTC 5520

CATATAAAAA TATTGAATGG TTAATTAAAG AAGGTTATGG AAAATTAACG TTACCAAAAG 5580

CATATGGTGG TGAAGGTGCG ACCATAGAAG ACATGGTTAT TTTGCAATCA TTTTTAGGCG 5640

AACTTGATGG TGCCACAGCA TTATCTATTG GTTGGCATGT GAGTGTCGTA GGACAAATTT 5700

ATGAACAGAA ATTATGGTCT CAAGATATGT TGGAGCAATT TGCTGTTGAA ATTAATAATG 5760

GTGCATTAGT TAATAGAGCA GTTAGTGAAG CTGAAATGGG TAGTCCAACA AGAGGGGGAA 5820

GACCAAGTAC ACATGCTGTT AAAGCTGATG ATGGGTATAT TTTAAATGGT GTGAAGACAT 5880

ATACATCAAT GAGTAAAGCA CTAACACATA TTATTGTTGC TGCTTATATA GAAGAATTAG 5940

AAAGTGTTGG TTTTTTCTTA GTAGAC 5966




(2) INFORMATION FOR SEQ ID NO: 23:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 17310 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23:

CTGTGTCATC GCGAAATAGT TAGGGTCATT CATTAATCCT TTTGAACGTA TTTCATCAAA 60

ATATAACAAT TTCATTAGTA AAGGGGACTT GTTCAAACCA GCTATAATAC AAAATAGACC 120

TATAGTCACA CTGCTTATAA TATAAGAGGT AACGATCACT TTTTTGCTAT TACCTAACTT 180

AAAG5TGATC ATCCCTAAAT AGAAATAAAT GACTACAAAT GCATATTTAA CTGTAGATGC 240

AAGAACTTCC TTAACCGTAA TAAATATCAA ATCATCAAAA AATaGCaAAC AArGCGTAAT 300

AATCATACGA TATGTATACA AAATAATGAm AAACTGTmAA AAATGATTTG CCTTTAATAA 360

ATGGTTAGCG AAAAACAGTA AATAAACTAA TATTAGTAAT GTGATAAAGT CAGCTATAGA 420

AACATTCACA CCGGCAATAA CCGAAGATTG CTGAATAAAA ACCGCTAAAC CGATAAGTAA 480

CAATGTTAGT AATTTACTAT TGTGTTGATT TTCCATTATA AACGTCTTCC ACTTCTTTAA 540

TCATTTTCTC CTCAGTAAAA CATTCTAAAT AACGTTTTCT AGATTGATTA CTCATTTTGA 600

TGTAATCACT GTCTATTAAA TATTTTTCCA GGACTTTAGC AATAGTTTCG GGTTGGTTGT 660

TCATCATACA TATACCATTA TCAGCTACTA ATTCTGAAAT ACCGCCAACA TGACTGGCTA 720
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

TTATTAAAAT AAACGTATCG TATTGTGATA ATAAATGACT CGCATTAATG ACATTGCCCA 840 

AAAATGTGAC ATCATTTTCT AACCCAGCTT GTACAACTTG TTGCTGACAA TCATTTAATG 900 

TAGGTCCATC GCCTATAAAT GTAAAATGCG CATGATTACT GTTATGTAAT TTCAATATCT 960 

CTATTGCCGC GATTAGATTT TGTGGCAATT TTGGATAAGC AAATCTTGCA ATCATAACAA 1020 

ATTGATGCTT TGTCGGGGCA TTAATCTGTA AATCTTGTTT ATTAGGCAAC ATTCCAACTA 1080 

CTTCGCCAAT ATTGTTATGT GATTGGCTTT TTAGCGTTTG CTTAACAGCG GGAACATCTG 1140

CAATACCATT ATGTATTGTG GTTAATTTCA ATCGATTAAA TCGATATTTT AACGCTAACT 1200

GTTTATCGAA ATCTGAAACA CAAATAATGC TATCTGTAAT AAGTGACATT AATTTTTCGA 1260

TAACTAAATA TAGAAATTTT TTAGCTGGTT TAACACCCTC TGTAAAAGCC CATCCATGTG 1320

CAGTAAAAAC TATACGTGTG TCTTTCGATT TCGAAATGAa CTtCGCAATT CGTCcGACCG 1380

TtCCAGCTTT GGAAGAATGT AAATGGATAA CATCAGGTTT AATTTTCGAG AATAACTGTG 1440

CTAACACTTT GACAGCTAAA ATATCTTGTT TAAAGTCAAT TGGACCTACT AAATGTTCGA 1500

TAATAATTAC ATTAACTCTT GCATCTAGTT GTTCAATCAT TGGTCCATGA TTGCCTACAA 1560

TGACATAAAC ATCATTGTGT ACGCAAAAAT GGTTGGCGAG TTGAATGAGA TGTGTTTGTG 1620

CACCACCATT GTCTGCTTTA GTAATACAAT ATATAATTTT CAACTGTTAC AAACCCCTTT 1680

AATGCTATAC TTTCAATTTC TTAACATGGC TATCTCATCA GATGAATAGT ATTTATAGCC 1740

ATGCAAATCA ATGATGGCAC ATATTTCTTA ATGCCATTTG ATACTGTCTC AAGGGATTCC 1800 

TCGTTATACT GTAACAATTG GTCACAATCT TTAAAATATA ACTTTTATTT GAACTTATTA 1860 

AGTAAATTAA GACTACCTTG AGCCTTCCCC TGTAATAACA ACCATCAATG TTCTAATTGA 1920 

TATATATAGT TCCATCATTA AACTACCTTT ATGTATATAT TTCATGTCAT ATTTCAGTTT 1980 

TTGTTGCGGT GTTAAGTCAT ATCCACGTTG AATTTGCGCA AGTCCTGTTA ACCCTGGTGT 2040 

AACAAGACAT CTTTGCTCGA AACCTATCAC TTCTGAACTA AATAATTCTA CAAATTCCGG 2100 

ACGTTCCGGG CGTGGTCCAA TAAAACTCAT TTCCCCTTTA ACAACATTAA TTAGTTGTGG 2160 

TAATTCATCA ATGCGTGTTT TACGAATAAA CTTCCCGACA TTTGTTATAC GATCATCATC 2220

TTTATCAGCC CATTGCGCAC CGTTTTTCTC TGCGTTTTTG CACATCGAAC GTAATTTGTA 2280

TATTTTAATT AATTTACCCA TCTTCCCAAC TCTAACCTGA CTATAAATAG GGTTTCCTGG 2340

CGAATCTATG ACGATAGCAA TGGCGAATAT AACCATAATC GGTAAAGTTA AAAATAATAA 2400

AACAATGCTT AAAATTAAGT CAATCGCACG TTTAATTGGG TAATAGCTTT TTCTCACTTC 2460

TTCTAGTTTG TCTAATTTTC TTTGATAGGC ATAACCCTTA TTATTATGGA CAGCTTCAAT 2520
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AATTAAAGTA ATCCTTTAAA CCTGTTTCTA CTGTATATTT AGGAACAAAT CCTAATGCCT 2640

TTAAGTTAGA AATATCTGCA TAAGAATGCT TAATATCTCC TTTTCGTGCT TCTTTAAATT 2700

CATGCTCGAC TGATTTTCCA TATAATTCAC CAATAATACG ATAAACCTCT AATAAATTAG 2760

TAAAAGTGCC TGTACCAATG TTATAACCGT GTCCAATTGC ATCTTTGTGT TCCATAATTA 2820

AGCGTACAGA TTGAACAACA TCATATACAT ATACAAAATC TCTAGTTTGC AGTCCGTCAC 2880

CAAAAAATGT AAATGGCTTG TTATGCTCAA ATGAATCGAA CATCTTTGAA ATCACACCTG 2940

AATATTGTGA CTTAGGATCC TGTCTTGGCC CAAATACATT AAAAAATTTA ACAACCGCTG 3000

TTGGTATGTT ATATAACGAA CAATAATTTA ATGTCGTCCG TTCGCCGTAA TATTTATCTA 3060

TTGCATATGG TGATAATGGT AAGATTAATG ATTGATCACT TTTAGGCAAA TCAGGAAGAT 3120 

CACCATAAAC AGCTGCTGAC GAAGCAAAGA TAAAACGTTT TATATGATTA TTATATTTTT 3180 

TAATGATTTC TAACAATCTT AATGTTGCTA CGACGTTTAT TTCTTGAGAT AAGATAGGTT 3240 

TCTCAACCGA CTCAGCAACA CTAACTAATG CTGCTAAATG AATAACATAA TCAAATTGAT 3300 

ATGTCTTCAT GATTTGTTCA ACTGCATCAT ATTCACGAAT ATCTAATTCA AACACATGAT 3360

CGTCAGCCAA ACTTTTAATA TTTTCTCGTT TACCTGTTCT ATAGTTATCT AGAACATAAA 3420 

CATCATAATC TTGTTGTAAA TCATCTACTA AATGCGACCC AATAAAACCA GCCCCACCAG 3480 

TTATCAAAAC TCTTTCCAAA TCTTCCACCT CATTTATACA TTAAAAATAT ATCATAAAAA 3540

CATAAAGTAT TGTAAGCTTT TTATCGATAT TTTTTATTTA TAAAAATAAA ATGAGATAAC 3600

TTTGTGAATT TTTATTGAGA TAAATTAGAT AGTGGTGTTT TTGTGATGTT TTATAATATC 3660

TTGGGTGTGT TAATACTAAT AATGCTTTCA ACTGATGCAT TAGACTGTGA CATCATAACT 3720

CACTTAAGAA CTTCGCTTAT TAATTTTCTA CCAATACACT CCCTTCTAAG TGCACTAAAA 3780

AATCCTTACT GCTAAGTGAT TAAACTTAAC AATAAGGATT TATTTATCAT TAGTGGATGA 3840

TTATTAACGG AATCTCATAC CACCATCTAC AATAATTGTT TGTCCAGTAA TGTAATCAGA 3900

GTCTTTACCA GCTAAGAAGC TCACTACATT TGAAACATCT TCTGGTTGAG AAACTCTGCC 3960

CAAAGCAATC TGACTTGTAA ATTGTTCCCA ACCCCATGCT TCAGGTTTAC CTGCTTCTTC 4020

GGCTGTTGCC ACTGCGATAC TTTCCATCAT TGGTGTTTGA ACGATACCAG GTGCGAATGC 4080

ATTCACAGTA ATACCTTCAG ACGCTAAATC TTGTGCGGCT ACTTGTGTTA AACCTCGCAC 4140

TGCGAATTTT GTACTGCAAT ATAAAGACAA GCCTGGGTTA CCCTCAACGC CTGCTTGAGA 4200

TGTTGCATTG ATAATTTTAC CGCCATGATT GAATTTTTTA AATTGTTCAT GTGCGGCTTG 4260

AATACCCCAT AGCACACCTG CAACGTTCAC GCCATATACT GTTTTAAACT GTTCTTCAGT 4320 
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GCCAAATTGC GCGGCAGTTT GTCTTAcTGC GTTAAATACA TCATCACGGT TTGATACATC 4440

TGCTTTGATA GCAATAGCTT TTGTACCATC ACTTGATAAT TTAAGTGCAG CTGCTTTTGC 4500 

CCCTTCTTCA TTGAAATCAA CAACTGCTAC TTTGAAACCA TCTTCCACTA AACGTTCTGC 4560 

AATTTTAAAA CCAATCCCTT GTGcTCCGCC AGTTACTAAT GCTACTTTGT TGTTTGTCAT 4620

AAAGATCACT CCTCAAATTT CTTTCCTTTA ATTACATTTT ACTCCTCTTC ATTTGAATAG 4680

TACAACAAAG GTAGCTCCAT TTAACAAAAT ATTCAGATAT TTAAGGTATA GTTAAACGCA 4740

CTACCATTAG TGATTGGCAA TGCGTTTAAA TGTCGTTTTA AAAGTTCTTA TGTTGAATAT 4800

TATTTTTTTA AGTCTCTCGA TTAGTTTGTC ATCAATCTTT TTTCGAGACA TGGTCTTTTG 4860

ATTCAATAGG CGGTTCCGTG TTATCACTGA CAACTTTAGT TGTAGCTTCA TCTTTATGTA 4920

TTTCTTCGTT AAATCCTTCA AGGTTTTTAG TCGTGGGATT TTTAACCTCA GGATGTTCCA 4980

TCATGTCTTG ACTATCAAGT TCCTTTTTAC ACGTGTCTTT ATGTGATGCT TGATTTGCGT 5040

TCCCTTTACT TTTTTGAATA GTGGTAGTAT CTGCTGCAGC TACTAATTTT TTTCTACCTA 5100 

AAATAGATAT GGCTGAAACA AACCAGAGTA TTGCAGATAC AAAGTTGCAT AATACTAAAG 5160

CGATAATAGC CAATACAATT AATATGACAC CTTTTGAAAT CCTTTCTTTA AATAAGTCAG 5220 

ATGCCAATAC GATGACAGGT ACGATTGAAA GTATAATTAC AAATATAGAA ATTATTGCCG 5280 

ATATAACTAT TGTTACTATT AAATAATCAG CTCTGCTACC TGATAATAAA TAGAAAAGGC 5340

CGAAAATTAG TCCATAGCAA ATTACAAACC CACATAAAGT TATAGCCATG AGTACTATAT 5400

AAGCTATTTG AAAATATAAA CCTATCTTTA TGAATGATTT TTCTACATTT TTTTCCATGT 5460

CTATTCCCCA TTTATTTAAA ATTTATACTT TACCTTAAAT ATTCTCTTTA TTCTTTAGTG 5520

ATTTTATCTT TAGATTCAAA TTGATTCTCT GTACTTTCAA TATCAACTTT TTCATTTTCG 5580

TCTGTCGATT CATCTTTTGA GTATTTATTC CAAATCAGCA AAATACCACC AATCAGCCAT 5640

AAAATTGACG AAAGGAAATT ATATAAACAC AGTGCAATAA TAGCATAAAC AATAAAAAGT 5700

GCACCTCCGA TTACAGAGTA ACTTTCCATA TAAATCGCAG TAAAGATGGT TGGTAAAACA 5760

GTGAAAAGAG CCAATATTAA TCCTAATAAA AAAATTGTTT CGTAATCAGA TCCTCCAGCA 5820

ATATTAATAG ATATCATCCT AACAAAAACG ACACTAAAAT ATATTTGAGC TACGATGCCT 5880

ATCCAAATTG CTATTTTTCC TATAATTGAG CTCATACTCA TTCCCCATTT ATTTAAAATT 5940

TATACTTTAC CTTAATATAC CTTATTTTAT TTAATTTTTA TATGCAAAAT ACAAAAATGG 6000

AGAACTTCAA TATTTATAAA ATATCAAAAG TTCTCCACAC TATATTGTTT TATTATATTT 6060

TCGCTATCAA TACGCTAAAT CATCATATTT CCCTCAACAT CACAGTAAAA CTATTGCTCC 6120 
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TTCCAATTGC GCAGTTGTTC AACATCATCA TCTTGTTTAA GTAATGCCAG TGGTACTTGA 6240

AGATTAAGAC ATCGTCCTGA AATATTAAAG CGTGTCACAC CTGCTGGCAC AGTTTCCCCT 6300 

TTATGAACAA CCGCTTCAAT TTCCTTATAA CTCAATGGCT GATACTTCAT GAGTACATCT 6360 

TGTTGAGAAA GACAAGGATA TGTACCTTGT GCAATTCTCT CTACAGAACA ACAACCACTA 64 20 

TAACTTGCGA CAACCTTTTC CCATACTTGA AAATGTGCTT CGCCTAAATC TTTTGTATAC 64 80 

AAATATTGTT CTGTATCACC ATGACACATT GTAATAAATG GCGCTTCTTG TCTTGTCTCA 6 54 0 

GTAGTCCATG GCAAGCGATG TTCTTGTTGT AACGTTTCCC ACCACACACC AAATGGAACT 6600 

, s TTATGTTGCC ATGTACTAAT TGAATATTGT GTTTCATGGA TTTCTTGCAC TGGAACTTTC 6660 

TTACATCCTA ACGCTTTCAA ACTTGTATAC CGATGCACAC CATCTATAAC CATATATCTA 6720 

CCATGTTGCA TCGCTGTCAC TAAAATAGGA TGACGTATAA AATCATCTGC TTCAATACTA 6780 

CTTTTCGTTT TTTCCAATCT TAAAGGTTCG AATGTTTCGT GAAGATCAAT CTTATCfACT 6840

GGTACCAATT TTAAATGTTC ATGAATATGA TTCAATAGTT ATTCATCCTC CTTTGTTTGT 6900

GTTAAATAAA TAAATTCAGG ATGTGGATGG CTTAAGAAAT CGTGATGTGA AATAGACCAT 6960

CCGTATGCAC CTGCATATTT GAAAACAATA ACGTCGCCTG TACTGATTGC GTCTATCTGT 7020 

ACTTCTCTAG CAAAGACATC TTTCGGTGTA CATAATTGAC CGACTAACGT TGTGTCCTGT 7080 

CTCGAAATTG AAACTTTTTC AAATGAATAT GGATTGTCCT TATAGCGATA AATGTCAAAA 7140

GGATGGTTAT GTTGCCAAGA TACCGGCAGT CTAAATTGTT GCGTACCTCC TCTTAATATG 7200

GCATACCAAG CACCATGTAC TTTCTTAATG TCTAGCACTT CTGTCACATA GTAACCAATA 7260 

TGTGCCACAA TAAAGCGCCC ACATTCAAAG TTCAATGTCA CATCTTCCAT TTCTTGCTCA 7320

ACGATAAGTG TTTTAAAACG TTCTACAAAA TTATCCCATT CAAATTGGTT AGTTAAATCT 7380

GCATAGTTAA CGCCTATGCC ACCACCAAGA TTGATATGTT TGAGTGGAAA TCGATGTTTT 7440

TCAGACCATG CCTTTGCTTT TTTAAAATAA AGTTTCACTA CATCGACATG TAAATTCGAG 7500 

TCTAAATTGT TAGAAATAGA ATGAAAATGA AATCCATCTA GATGAATCTT TGGCATTGCG 7560

AGCGCAgcTT CAATGACATC ATCAACTTCG TCTTCAGAAA TACCAAATTG TGTTGGGCGT 7620

CCTGCCATAT GCAACGTTGC ATTGGGAAAT GGTCCTGCTA AATTAACACG CAATAAAATG 7680

TGTTGTGTCT TATCTTCATC TTCTAAGATG GCATTTAGCC GTTGTAATTC ATGCATACTT 7740

TCAACATGAA TACGCTGAAC ACCTTCACTT ACTGCATATC TTAGTTCCTC GTCTGTCTTA 7800

CCAGGGCCAC CAAAAATAAT ATGATTTGCT GGTTTAAAAG CAAGACCTTT TGCTATTTCA 7860

CCTTGAGATG CAACTTCGAA TCCTTCAACA TACTGACTAA TTGTATCTAG GATTTTTCGT 7920


TGTTGCAAAT GATGTTCCAG TCCGACTAAA TCATAGATAT AATGACAAAC TGGATGAGAT 8040

TGTGCTTTTA ATTGTTCAAT AACAGGTTGA ACTATACGCA TTAGCCTTCA TCCCCTTTCT 8100

GTTTAGACGT CGCTAGAGAT GCACTTAAAT GGCGATATAT TTTTCCGCGA TCATCACCTA 8160

AAATAAATGT TTGTACACCT TGTGCCTGCC ATTTTGCAAT ATCTTCATCT TCACGTGGTA 8220

ATGCACAAAA ATGTTTACCA TGTGCATTCA CAACTTCAAA AATATGTTGA ACATGTGATG 8280

TTACTTGATC ATCACGCGTT TGCCATGGTA TGCCAAGTGA CTGCGATAAA TCTGCGGCAC 8340

CTTCGACTAT CATGTCTAAA CCTTCGACTT GTGCTATATC GTCAATGGCC ATAACCCCTT 8400

CAACATCTTC TATCATGGCA ATCACCATAA TATGCTCATT AGCCATCTCC ATTGCATCAA 8460

GTAATGGTGT ACGTCCAAAT CTTGCCATGC GACCACCATT CAAACTTCTT AATCCTTGCG 8520

GGTAATAACG ACTTAATTTC ACAATATGCT CAACTGTCTC ACGATCTTTA ACGTGTGGCA 8580

CAATAATACC TCTCGCACCC ATATCCAACA CTTTAATGAT ATCTCTATCT ATCACTGCAG 8640

TGACACGTAC AATTGGTATA ATATGCGCTG CTTCAGCTGC ACGAATTAAA TGCGCTAGTG 8700

TCTCATCATT AATCGCCACG TGTTCTGTAT CAATCACAAC AAAGTCATAC CCGCTTGCTG 8760

CGATAACCTC GATCATCAAT GGGTCCGGTA TAGAATTAAA AATGCCATAA ACTGAATCAC 8820

CATTGTTTAA TCTATGTTTC AGAGATAGTT GTTGCATCAT TGATACCTCC TACACCTAAT 8880

GGATTTGTAA CATGATGAAT TCTTAACTCG GAGTCACTTA ATAATCGACG TGTCGTTAAC 8940

TTTTCAACTT GAATCGTAGG TTCAAACAAA TCGAAATGTT GATAGTTATT CAACTCTGGA 9000

AATGCTTCTT GATACGCCTC GATGATGCCT TTAACCCATT GCCATTGCAG CTCCTCATCG 9060

ATACCATATT GCTTTTCAAT AAATAAGATG ATTTCGGCGA TATTAATAAA GAAAAATGCA 9120

TCATGTAAAA AGTCGCGTAC TAAACGTTCG TCATCTGTTT CAATAAATGA ATTACTATTC 9180

AClTl?riTAT GTGCTTCTGG CATTGGCTTT AATGTCAGGT GTGAAGCAGC TTCACTTAAA 9240

TGCtCACGCT TAAAACGAAC ACCATCATGG AAATCTTTfA AGGCAATACG TGTAGGCCAA 9300

CCATTTTCftT GAATGAGCAT CATATTTTGT GCATGCGATT CAAAGGCAAT ACCGTGATAA 9360

TAAAGCATAT GAATCATTGG ACGAATCGCT ACAGCTAAAA ATTGCTTTGT CCAAGCTTCA 9420

GAACCATATT GTTTAATCCA ATTTTCAATG AATGGTACAC CATCCTTATC ACTTGCATAA 9480

AGTGCATTAA ATGGTATCGC ATCCTCTTCA TCGATTAACA TATGATATAT ATTTTCACGC 9540

CATATAACAC CTAACGCACC ATAAACTTGA GTTTGTTTAT AAGGCGAAAG TTGTGTATTT 9600

AAATAAGACT GTCCTAAGAC TTCCCCTAGA AAAACTGTCT TTAATTCATC TTTTAAATAC 9660

ATATCTTGTT GCTGTATCTG CTTTAACCAA TCCGTAATTT GCGCTGCATT TTCAATTGTA 9720
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w 



TATTTTGTCG TGTCTATTGG CGACATCGTA CGAATCGATT GTTGAGGGTG ATATAGCTCA 9840

TCACTTTCCC CTAACCATAG TACTGTGCCA TTAAGCCTTT CTTCAGCCAA ATCAACTTGG 9900

ATGACATGTT CAAACTGCCA TGGGTGTACA GGTATCATCT CAACATCATT TACATGTTTG 9960

CCAGATGCTT CAATTTGCTG TACAAAATGT TCATAAGTCT TATCGCCAAC TTGTTGACGT 10020

AACATTTCGT TAACTACAAC ATTTCTTGAT ACCGTCGTTT CTACTTTATC TTTGTCGATA 10080

GCTAACCACT GCAGTTTAAC GTTTGGTACA AAATCAGGAC GAAATTTCAA ATTATCACTC 10140

AACGTAAATC CTAAACGTGA TTTGTAACTT GGATGATACT GATGCCCTTC CATCGCATAA 10200

AATTCATAGT CGTTAAATGT CTCAGGTGTT GCTGGTGGGT TTGATTCTCG ATACTGCATA 10260

CTTTGCGTAT CTTTTAATTC TGTCTGTAAT AACTCGACAA TAAATTGTTC TAGCTTTTCA 10320

TCATTTTTAG GAAATGTAAA TACAACCTCT CTCAATAATT GTGTATAGTC TGTTGTTGTA 10380

TCTGCCTCAT CTCCTACGAC ACGCTCAATT GGTGATGTGA TACGTATACG ATCAAAGCTA 10440

TGTGTCTTTT CAGCAGTAAA ACGATACTCT GAATCATGTC CTTCTATTGT AAAATGACCG 10500

ACACCGTCTT GATATGACGC TTTATACACA ACAATATTCT CATAAATAAG TGATGATACC 10560

AGTTGGTGCA TCACTCTAGT CTTTACACGA TTAAGAATTG TTTGATTCAC AATACGATAC 10620

CTCCTTGTTA TGACAAATTG GATTTGGTAT ATGTGTATAA ATAGGGTTTG CACCACAATC 10680

ATTCAATTTA CTCATCAAAT TCGCTTTAGC CGCAATGGTC GGCGTTTGAT ATAAATCTTC 10740

TACACAGTCA ACAAATACTG CGTTATTCGC GTATTCTTTT TTCCAAGTCA TAAGACGATG 10800

CGCTACAAGT TGCCATAACA CAACTTCATT TCTAGTCGCT TTACCAATAG TTGATACTAA 10860

ATGTCCTAAG TGATTTACTA CAACGTAATA TTTAAGACGA TGCCATGCTT CATCATGTGC 10920

ATATACAACA GGGCTTGATG CTGCCACAAC ATTTGGCACA AGCTGTTTTT CAGTAGCAAT 10980

CGTT5TAGAT AGACAAATGC CTTCAAGATC TCTGACAAAG CATACGTCGG GTATGCCATC 11040

TTTTAATTCA ATTAATGTAT TTTGTACATG TGCTTCTAGA CTAATGCCTG TGTTACTAAA 11100

CAGCTTTAAT ATCGGCAATA ATGTACGATT CAAATAAGAT TCAAGCCATG CTTCTGGTGC 11160

TAAACCACTT TGCTCAATCA CTTGTGATAA CTTAGACATC GGTGAATCAG GCATCGTTTC 11220

AAATAATGAC GCCAATACAT GAATATCTTT ATCAGCATGG TAATTCGGTA TCCCTTCACG 11280

AACAATCATG GCACTATTTG TTAATAAATC CATTTCAGGT TCAACTGTTT GCCCTAATGG 11340

ATTCGGTAAC AATGCACGAT ATCCTTCTTC AAACATCAAT TTAAAATGGG GTGTTTCAAC 11400

CTCATCTTTG ACTGATGCGA TAACTTGCGC GGCATCAATT GTCCGTTCAA TCTGTTCAAG 11460

GTCATTCGTA CGTATAAAAT TAGTGATTTT AACGTGTATC GGTAATTTTA AATAAATGTT 11520
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GCCAAGGTCT TTTATTAAAC CTTGTTCACT ATATTGCATA TACTGTGGAT GCTGTCGCAA 11640 

CACATTGATT TGATAAGGAT GTGTTGGTAA TAAAATAAAA TCTTTGGGTA TCTCTGATAT 11700 

ATCTATGTCT GCTAATTGAT ACAACACTTT CTCAACCTGA TCTTCTTTAC CTTCTACATA 11760 

GCGCGTGAGC AGAACATCTT GATGCACAGC TAAATAATGC AATTGGAATG ATGTATGACA 11820 

TTCGGGTGCA TATTTCTCTA AATCTGCTTC TGAAAACCCA CTTGCACTCT TAGGAGTCGG 11880 

ATGAAATGGA TGACCTAAGT ATAAAGATTG TTCTGAAACG ATATAACGAT CCTCTACGTA 11940 

GTCTATTGTG TTACTTTGCA AATAACGTGC CGTGCGATGA ATGCTATTAT CGATGTCAGA 12000 

CATAATTTGC GCCATATGTT GTTGCACTGC CGTTTGATTA TCTGCACTTT GAGCCATATG 12060 

TTGCAAAATA CGCGCAATTG CTTCTTTATA AGTTGTTATT TTTTTACTTT TTCCATCGAT 12120 

AAGCCATACC TCTGGATGAT ACATATGATG CCCCATCGCA GACCAATAGC GAAATTCACC 12180 

ATATAAATTT TCTTCTCTAA AATATTCATT TAAAATGCGT TCGATAGCCG CATACGCTGC 12300

ATGTTGTATT AATTCTTTAT TTTGCACTTT TTTGTTTCAA CTCCCATAAT TTCATTAATG 12360 

TGTGATCGTT GATTTGATTA GTGATGGTTG AACAAATTAA AAATAAACTA CTTACTGCAA 12420 

ATACTACGCC CATAACGATA AACGTAGTAG CTGGTGTAGT ATAACTTGTA ATGGCAGCGC 12480 

cACTaAGACT GCCAATAATT TGACCAACAA CTAACATACT GTTCGTCGTT CCAACAAATG 12540 

TGCGTTTAAG TTGTTGATGA CACGCATTCA CGACAACAAA CATGACACTT TGAATCAATG 12600 

CACTATATGT TAATCCTTGA AGTATTCTTG CAGCCATTAA AAACTCTATA TTCGTCGCTA 12660 

AACCTTGCAG TATCGCACTA CAACCACATG CAATCGTGGC AAATATATAT ACTGATTTAA 12720 
CATATGATTT ATCATTAAAG CGTCCCCATA AAGGCGCGCT TAATATCGAA GCCGTCCAAA 12780 
ATGCGGACTG TAAAAATCCA ATCACACTAC GGTCATCTAT CGCTGTATGA TTCACTGATG 12840 
AAGCAAGTGG TGATAATGCA GTTAGCATGC CATACATAGC AAAGTTTGCT AAAACGCCAA 12900
CGATAATAAA TCGACATGTT TGTTGTGTGC ATAATAGACA TTGAAATGAA CGGCGAATAC 12960
CTTTATTAAT ATTTGGTGTT TGTGATTTTG GCATATGTGT CGTTTCAATC AATTTTAATG 13020 
CACCGAAAAT ACAGACAATA AAAGTAATAA CGGCAATACT CATCAGTAAC GCACTAAAAC 13080 
CTAATATCGA AGCTGTAACA CCGCCAATTA ATGGCCCCAC AAGAGACCCT GCGCTGACTG 13140 
AACTTTGCAG TCTTCCTAAT ACCTTTCCAC GATCTTCAGC TGGCGCCTCT GCACTCGCAA 13200 
ACGCACTTGA TGCATCAACA ACACCACCAA ATAGTCCCTG CAATAACCTC ACAAGTACAA 13260
ACTGTAATGG TGTCGTACAC AATGCCATTA AAAATAAGCA TACCGCCAAA CCAAGTAACG 13320
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ctATCATCGT CGTTACAGCT GGAGCAGCAA TCGCTATACC ACTCCACAAC TGTATTTCTA 
CGACTGATAG ATTTTGTAGT GATGCCATAT AAATTGGCAA TAATGGCACA AGTACTGTCA 13500

GTCCAGCAAT CGCTATAAAC TGACTGAGCC ATAAAATGCG AAAGTTACTG CGCCATATAG 13560 

ACTGATTAAT CATATGTCAC CATTGGATTT GGTACGGTAG TTAAACCTGA AGGCATACTA 13620 

CCTCCACCAC TATCACGTTG ATATAGCAAT GGTAATAAAA TTTGTTTGAA TGGCCACGTC 13680

TGTTTATCAA ATAAAATGTG TCTGACAGCT AGCTGATCAG TTGTAACCCA GGAAATAGTT 13740

GCCACTTCAT TTTTTAAAAT TTGTTTTAAC AACGACATAA GTTCATGCTC ACTTACACCA 13800

AATAAATCTT GAATTGCATC AATAATGGCA TATAGATTTA CCGATACAGC TAATGTTTGA 13860

AAATAAGCAA AGAATGTTTC CAAATCCTCA TTAATTAGCG TATTAGGTGT ATCTTCTCTG 13920

ACGACATACT TCGGCAATGA AAGCTGATGT GCTGTTAGCC ATGGTTTATA AATTCTGACA 13980

GTATCATGAT CACGTAACAC GCATTTTTGT ACACGTCCAT CTTCAAATGA CAACAATATA 14040

TTTTGACCAT GCAACTCTGG TAATGCGCCG TATTGCATAA ATGATAGTGT TACCTTTAAA 14100 

AAGACTTGCG CGATATCTTC AAATAACGTC ATGACATCAT TTTTAGAAAT ATTATCTTTT 14160 

CCACAAATCA TTTGATATAA AGTGCGATCA TTTGCCGCGA GTGCTGCCAT TGACACTAGC 14220 

TGTTGCGTAT CATTTTTGGC TAGCACTTCG GGATACTTTC TTAGCTGAAC AGTTAGATGA 14280

CCTAATTGAT CTTTGAAAAT ATCATTATCT TGACCCATAT ATGACCACCA AGCTGTTTCA 14340

TCACAAACCA TGACATACTT AGCTAGTGCT TCATCTTTTT CTATAAGCTG ACGTAATAAT 14400

TGTTCTGCTT GTTCTCCGTT TTTCATGTAA CGCGTAGGCG TTAGCCTTAA TGCGCCTAAT 14460 

GACTGCATTG CAAATGGTAC TTTGACATGG TTATACGGTG CGCCAATATC AATTAATGAA 14520 

CGCATACTTG AAGACGACAG ATAATCTCCA AATTTTAACG GTAATAGTAC AACCAACTTT 14580

TCACTAATCT CTTTCGCAAA GACGTTCGGC AGAATATGCT GATATTCCCA AGGATGTACC 14640

GGAAATAGTA CATAGTCATC TATTGATAAC CCTTGATCAT TTAACATGTC TGTCGCTTGT 14700 

TCTTTTATAG GTACTGTCAA ATTTTCTAAT TCATCGATAT TTGCAGTATC GCCATGAATC 14760 

ATATGTGTCT TTTTAACTGC TGCAACCATT AAAGGAAATG ATTGATTTAA TTCAGCTTGA 14820

TACACTTGAT AATCCGCTTC TCTTAATCCT CTTTTTTCTT TAGCTAATGG ATGAAATGGA 14880

CGATCTTTTA AACTTGCAAA CTGCTCTGAC ATCACAAAAG GATGTGACGC TAAATCTAAT 14940

TCTGATAATT GTTTAGCAAG CTGTGTGGCA GCAGTAGTCA GTCCTTCTTC AACGCGAGCC 15000

ACTTCCCATT CATGACTTAG ATCACAATTC ATATTAGCAA TTGTTTGCCA AAATTCAGCT 15060 

GCCGTTAAAG GTTGCTTAGA CACCCTTCCC TCTATCGTAA TTGGTTGTGA ACTTTCGTAA 15120 
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TATATCAAAA 


GCGTTTGTCC GTTTTCTTTA GTAATCTCAC TATTCGATAC AATTCCGGCT 15240

ATATCTTCAA ATAATAATGC ATCAACTAAA TCTCTTAATA TTATCGCTTG TGCTGTATTG 15300

ACTGCTGTAT GATTCTGCAA TGTTCAGACA CCTCGCATTC TTAATATAGG TTCAATGTTG 15360

TCCCAATATT TTGTTGTTGT GCCTGTTGAT AAATAAAATA AGCACTTGAA ATATCTTCGA 15420

TAGCCATACC CATCGGATTA AGTAATATGA TCTCATCATC GTCTTCACGT CCTGGTATGT 15480

CACCTGTCAC AAGTTGTCCT AGTTCAGCAT GAAGAGCTTC TTTGCTGAAT TTACCTTCTA 15540

ACACCAATTG GTTAATAGTT TTCTTTTCTC GATTACATTG TGACCAGTCA TCTACTACGA 15600

CTTTGTCAGC TTTAATAAAG ACTTCTTTAT GCACATCCAT GATAGAAATG TTGCTAATAA 15660

ATGCACCCTT TTGTAACCAA TCATATTCAA TGTATGGTTG ATCCGTTACG GTACATGTAA 15720

TGACTACTTC ACCATTTGAT ACTGCTTCTT TAGCATTTTC TGTCGCAATA AAATTAATTT 15780

CCGGACGCTG TTGTTGCCAT CTATCAACAA AGCGTGCACA TGCTTCAGAG AATTGATCGT 15840

AAACAAACAC GCGTTCAATA TGATCGAATT GCTCTAACAT ACTTTGTAAT TGCTTGTCTC 15900

CGATTAGCCC GCATCCAATG ATTGTTAAGT CTTTAAATCC TTTTTTAGCC AAATGCTTTG 15960

CTGCAATCAC TGAAACTGCT GCAGTACGCA TACTACTAAT TAAACTTGCT TCCATAACTG 16020

CAATTGGATA ATTCGTTTCT GGATCATTCA AAATAATGAC GCCACTTGCA CGCTCCATAT 16080

TACGTTTCGA TGGATTGTCG TGCTTACTAC CTATCCACTT AATACCTGAA ATTGCGTGTT 16140

CACCACCGAT ATGACTTGGC ATTGCAATAA TTCGATCTGC GATGTGTCCA TTTTCAGGAT 16200

CCtGTCTTAA ATACGGCTTA AGCGGTTGTA CAAAATCATT GTGCGCATGG GCTGTTAATG 16260

CTTCTGTTAA TGCGTCCACA TAAACTTGTG AATGATTACC TCCCGCTTGT TCAATATCTG 16320

ATCTATTTAA ATACAACATC TCTCTatTCa TTCTGaTTTA ACTCCTTGTC TTGATTTCAT 16380

TTTTTCTAAC CATGTATCTG AATAAACTAA ATCTAAGTAA CGATCGCCTC GATCTGGTAA 16440

AATCGTGACA ATTGTTGCAC CTTCTTCAAT TGACGTTATC AACTGCTCAA TCGCTGCAAT 16500

AATCGAAGGT GTTGAAcCTC CGGCAAATAT GCCTTCATAA TCAATCAGTT TTCGACAGCC 16560

CAAAGCAGAT TGATAATCAT CTACATGGAT CACTTGATTA ATTTCTGATC TATTCAATAT 16620

TTCGGGTACA CGACTAGCAC CGATACCAGG TAATTCTCTA TTAATAGGTT TGTCACCAAA 16680

AATGACTGAC CCTTTCGCAT CAACAGCAAC AATTTGTGCG TTTGGATGCA CTTCTTTTAT 16740

TTTTCTACTC ATACCCATAA TGCTACCTGT CGTGCTGACT GGCGCGACAA AATAATCTAT 16800

AGGTTGCTTA ATTGTTTCAA CAATCTCTGT GCCTGCACCA TGATAATGGG ATTGCCAATT 16860

TAACTCATTC GCATATTGAT TAATCCAATA TGCATCGTCA ATAGTGGCTA ACAGTTCTTG 16920






TACATTGGCA CCATAACTTT TAATAATTTT CAAATTTGTT GGTGATATTT TAGGATCAAC 17040 

AACACACGTG AGTTTTAATC CCTTGATTTT AGCTATCATT GCCAACGCAA TGCCTAAATT 17100

ACCAGAAGTA CTTTCAATTA AATGTGTATT CTCAGTGATT AAACCATGTT TAATACCATG 17160

TTCAATGATG TACTTGGCAG GTCGATCTTT CATGCTGCCT CCAGGATTCA TATACTCTAA 17220

CTTTGCAAAC ACTTCATGTT TCGGAAATAG TTGATGAAGT TGAACCATAG GTGTTTGCCC 17280

TACAGAATCT AACAATGAAT CGTGCACATG 17310 
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5423 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

ATACTAGTAA GCGCATCGGT TATTGACATC GAATTCAACT TTAACAGTTT TCATGTTCGG 60

TGATGTTTCa ATAGAATGTG TGTGTTGTAC TTGCGCATTT ATATTTCCAC CTAAATTACT 120

TAAGTTTCCT GTAATACTAG AAATGTCAGG TGCGTTTAAT GTAGGTTGAA ATGCATCAAC 180

TACTTTATCT GCAACATTAG AAACATTACG GATAACTTTA CTTGAATGAT TATCTATACC 240

TTTAACGAAA CCTAACATTG AATACATACC AACATCCATG AATTCACGTG AAGGTGAGTG 300 

AATACCTAGC GCTCTTTTGG CTGCATTTAA AGCACCTTTT GCTACACTAG CTGCTTTTTC 360 

AGCTAAGTCT CTAGCCATAT TACCAATACC TCTCATCAAA CCACGGATCA TATCAGCACC 420

TGCTGATACA AAGTCATCCA CAAAGCTTTT AACTTTATTT ACTGCATTTG TCATACCTTG 480 

ACTAACTTTG TTTACAACAT TAACGAATCC TTGAACAACT CTATTAACAA rGTTAATTAG 540 


CGTACtTGTt ATAGTAGATA CCCaTnGCAT ACCTTTAGTG ACmATGAAGT TCCAAGCTTG 600 

AGACATTTTG TCTGATATAG TTGAAACAAC TTGTGTGAAT ATGCTTACAA CTTTATTCCA 660 

AATTGTCGTT AATATACCAG ATAAGAAACT CCAAATCGTA TTCCATATAT TAGAAATAAA 720 

ACTCCATGCC GCTTGTAACG CAGTAGATAT AGCTGTAGTG ATAGCGTTCC AAACCTTAGT 780

TGCCACAGTA ACTATAGTGT TCCACAACGT TTGTAAGAAC GTCCAAATAG CGTTCCAAAT 840 

TGTTATTGCG ATAGTCATAA TTGTGGTAAA CACTGTAGTT ATTACAGTGA CTAACAAATT 900

CCAAATCGTA GTAGCGATTG TAATTATCGT ATTCCAGATT GTACTTAAGA ACGTCCAAAT 960 

AGCTGTCCAT ATCGTCATAA CTATTGTCAT TATCGTCGTG AAAACAGTTG TAATGATTGT 1020
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ATAAGCGACT ATTTGATTCC AAACAATCAT TATAAAATTG TAAACATTCG ATACTGCTGT 1140 

AGTGATAGCT GTTAAAATAG CATTCCATAC AACCGAAGCT ACAGCTTTTA ATACATTCCA 1200 

AACATTAACC ATAAACGTTT TTATCGCATT CCAAGCATTT ATAATAAAGT TTCTGAATCC 1260 

TTCATTTTTA TTCCACAATA AAACGAATAT AGCTATTAAT GCAGCAATTA CACCAATTAC 1320 

TATTGTTATT GGACCGCCTA AAATACCAAA CACAGTTACT AGTCCTGTGA TAGCATTTCT 1380

AATTAATCCA ATCTTACCGA ATAACAATTG GAATATAACT GATATAATTT TTAATGGTCC 1440

TTTTAATAAC ATGAACGCAC CTTTTAAAAT TGTTAATCCC GCTCTTAATA AACCGAACTT 1500

ACTTACTAAT GCAATGrTTC TACCTATTAA TCCGCCACCC ATAAAGTTAG ATACAGCAAG 1560

AATAATCGGT ATTAAAAATC TAAATGCACC AACTAAAGTT ATAATGACAC CAACTAATTG 1620

TGCTGTAGCT GGATGCGCCT CAAACAAGTT AGCTATCCAA CCAGTTATTG CAACTGCAAC 1680

GCGTAATACT GCACTAGCTA TAGGAGCCAT TGCTGTTGCG AATGCArmTA ATCCTCTTGC 1740

GATGTTTCCA ATCAATTGCA TTATTAGTGG TCCATTTGTT TGTATATAAC TGACAAAGTC 1800 

TTTAAACCCT TGAGATTGTC CTACTTGTTC AGACCATTCC CTAAACTTAG CTGTCATTTG 1860 

TTCAAGAGAT TGGAATATGC CAGTTGATGA TCCGCTGAAT GCATTCATCA AATTGTTAAT 1920 

TCCAACGAAA ACATTTTTGA AAATATTACC AATGATAGGT AAGTTTGTTT TTGTGTATTC 1980

AATAAAACGA GTTATCGAAT TTTCTCCAGC TGCACTATTA GCCCAGTTAG AGAAAGATTG 2040

ACCTAATCTA TCCAACCAAT CAGCCGACCA TTGAAACAGT GGTGCTAATT GCGTGAATAC 2100 

ATTGACTAAT CCGTCACCAA AACCACCTGC AGCACTTAAT AGCTTGTTAA ATACCGAAAC 2160 

ACCCGTTGTA TTCATCATAT TAAAGAATCT TGAAGCTACA CTGCTATTTT CAGCCCATTT 2220

AAGCACGCTT TGAGACGCTT CTTCCATTCC TCTTGAAATA CCACTAAAAA ACGGTTGTAA 2280

GCTCTGCATT GCAGTTTTAA CAGTATTTAA ACCATTTGCA AGAGTTGTGA AGATAGCGGA 2340

TTGATTTTGC TTTATAATAT CAGTCCATGC TGACTTTACG CCATCTAACG CTTTTTTGTA 2400

TTCGTTTGTT GCTGAGCTAG CTTGTAAAGT GCCATCATTA AGCATCTTTA TAGCGCTGAT 2460

AGCCATTGCG CCAAACGCTA CAAATCCTGC TCCCGCTATT GCTACGGCAC CACCTAAAGC 2520

AAGTACACCA CCAGTTAACA CTTTGATAGC GTTTAATAGC GCAAATACTA CAGGTACTAC 2580

GCTCGCTATT ACAGGTATTA AGATACTAAA AGATGATGTA AGTAATCCAC CAACCATATT 2640

AGAACCTACA GTACCGAACA CACGGAACAT ATTAGCTAAA TTCCCCATCT GTCTTTGAAA 2700

ATTGTCATTT GCTTTTATTA TGTAGGCATA AGCTTTCTTT AAACCATTAG TATCGACATC 2760

TACCTTTGTT GTTTTTTTGT TCGGCAATGC GTCTAATGAT TTTTTAAACG CATAAATAGT 2820
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AAGTTCTTCT TTAGTACGTT TGATTTTAGA GTTAGCAACA CCATTGTCCA CGTCTATAAT 294 0 

AGCTTTGGCT TTAGACCTAT TTAATGCTTC GAGACTAGCT TTAGATACTT TTAACACTCG 3000 

ATTGAATTTA CTGTTATCTG CATTGACGTC AATATTGACA CGTTTCTTTT CTAATTCTGA 3060 

TAATTTAGCT TCTGTTTCAG CGATATCTTT AATCAACTTT TGTTTTTGCA ACTTAACTTC 3120 

TGGTGTAACT TCTTTAGAGT TTAGTTTGTC TAGTTCAAAA TTCGATTCTA GTACCTTTTG 3180 

TTGTAAATCT TGTATACTAG CATCTAATTT AGCTTTTACA TTTTTGTTAC TAAAGGCATC 3240

TAAAGACTTT TTAGCAACTT TGATAGTTTT TTGTAAATTT TTATCGTTAG CGTTTAATTC 3300

AACATCTTTA GTTTGATCTG CTACTCGTTT AAATCTTTGC ACAGACTTAA CCGCACTATC 3360

AATTTGCCTT TTGAATTTGG CTACACTAGC TTCAATAGTC GCTTTAATTT TATATTCCGT 3420

CACATTAACA CCTCTCTTTC TATTGCTTAT TAAATTCTGC TATAACTTTA AAGAATTCAT 3480

TATTTTGTGG TTCGTATTCA TCACGTTCGC TACTAAATCT TATATCTTTA CCTTCGTTAA 3540

GCGGTTGGAT ATTTTCTTCA TAAGGCAATA CGTCGTTTGC ATTGTTAAAA ACATATTCCT 3600 

CTTTAGGTTT ATTTTCTGTC CCAACATTTT TAGTAGCTGC AGCATCACGA ATAGCAAACG 3660 

CAAGTTTGTA ACGTTCGAAT TCTTGGGTTA GCATTTCATA CTCTTTCGCA TACATTCGAT 3720 

AGTTATATTC TGTTAATGTC ATTTGCTCAA TAACGTTCAA ATCTGTAATA CCAAGTGTTG 3780 

ACATACAAGT TATAACGATT CTGTCGTAAG TTATTAGGcT TCCGCTGGTT TTTCTTCCGT 3840

TTCCACTACT TCGACTAGGT TTCGGGTCAT AGGTCGCTTT CCCAAcTCCG TTAAAATATC 3900

CGAACCGAAT TCTTCTAGTC CGATATTTTC TGCGATTTCA TCTAATGCTT CATCAATGTT 3960 

ATTAATAGTA ATTGCTTGTT TTTTTAAGTG AGATGTAGCT GCGATTAAAA cTTCGCCAAT 4020

CACAACCGGA TTTCCACTTT CTAAACCTAC AGGCAACATT GATACACCTT GACCGATAGA 4080

AGCTTGTTCA ACTTTTAAAC CTAATCGGTT ATCGATTTCT CTTAAAAATT TAAAACCAAA 4140

ACTTAATTCT AATGACTTTC CGTTAATTTC TACATTCATA ACTTAAAATC TCCATTCATA 4200

ATTAATTTAA ACAAAATAAA mArGCTTAAC GCCCTATTTT TATACCTCTC TTGGTGCAAC 4260

CGGTGGTGAA TCTACTTTAG GTTGTGGAAT TGCTGTTAAA TCTTCGCCAG TTAATGCATC 4320

TGCTTTTGTA GTGTCGTGGA ATCTGTATcC AGTCGCCTTA AGTTTCTTTG TTACAGCCTC 4380

AGGTAGTGTT GCAAATCCAC GTTGGAAACG ACCATTCACT CCATATTCAT ATTCATATTC 4440

ATCAATACCG TTAGCTTCTG CTTTTAATTC AAATTTATTG TGGAAACCTT GGAAATATTT 4500

CGCTTTAAAT TTAGCGGAAT CCCCATTTTT GCCTGGTATT CTACTTTCAA CTTCCCAAGC 4560

TTCATACAAT ACGCGATCTA CAACTGCATC TTCAATTTCA TCTGCAAAAT CGTCACCATA 4620
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GTCCATTGTA TCCTCTGTAT CTGTATCAGC TTCATGTGAT AAGCCGTATT CAGTTAAAAA 4740 

AAGCATTTTA GTAGCATCTA CTTTTTCGCC AGCTTTTCTA AATAAAATAA TACGATCATT 4800

ACTATTTTTC ATATTTGCCA TTCAATATTC CTCCGTTTTT TAAAATGTTT TGTAAGATAT 4860

CGTTACTGAT GTGTGTAGCA ATTCTTGATT GGTAGTATCA TCAACTAACT GTGTGATGTT 4920

AGTATCTTCT TCTTCAAAGT CATAATCGTT TGTTTTAACG CTAGGTGTTA AATCATCAAT 4980

ACATCTTTTA ACAAGTCCGT CATGATGTCC TAAATCATCG CTTACACTCC AAATATCAAT 5040

AACTAAATTC GTATCGCCAG AATAACTATC AAACGTGTAC TTACTTCTAT TTGACTCCGG 5100

CATTTTTATT ACAAAAAAAG GATACGGAAT CTCTTGTTGC ATCTCTTTAC GAGAAATAAC 5160

AGGGAATCCA TATCCTTGTA GCGTTTCATA CGCTTTATTA TAAAGTTGTA AGTTCGGTGT 5220

CATGCTTTTA TCTCCTATTC AAACAACGCT TTCAATTCTT CTACAGTTGA TTTCCTAATC 5280

ACTTCGTATA CCGGCCACAT AAAAGGTTCA GCCTCCATGT ATCGAGTAGC AAATTCTAAG 5340

AAACCACTAT AAGCTGCGTG CGATGTGATA GTGTATTGCA AATCGCCAGT TTTTTTATAT 5400

CTGATATTGC GTGATaAATT ACC 5423
(2) INFORMATION FOR SEQ ID NO: 25: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6251 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

AAACGCAGAT GTTCAATTAG AACCAGTCTA TCGTATTAAG GAAGGTATTA AACAAAAGCA 60 

AATACGAGAC CAAATTAGAC AAGCGTTAAA TGATGTGACA ATTCATGAAT GGTTAACTGA 120 

TGAACTAAGA GAAAAATATA AATTAGAGAC CTTGGACTTT ACTTTGAACA CATTACATCA 180 

TCCTAAAAGT AAAGAGGATT TATTACGTGC TCGTAGAACC TATGCATTTA CTGAACTGTT 240 

TTTATTCGAA TTACGTATGC AATGGCTAAA TAGATTAGAA AAGTCATCTG ACGAAGCAAT 300

TGAAATTGAT TATGACATAG ACCAAGTTAA ATCATTTATT GATCGTTTAC CTTTTGAACT 360

AACTGAAGCA CAGAAATCCA GTGTTAATGA AATTTTTAGA GATTTAAAAG CACCAATACG 420

TATGCATCGA TTACTTCAAG GTGATGTAGG TTCAGGAAAA ACAGTAGTTG CTGCAATTTG 480

TATGTATGCG TTAAAAACTG CTGGTTATCA ATCAGCATTG ATGGTACCAA CTGAAATTTT 540

AGCAGAGCAA CATGCTGAAA GTTTAATGGC TTTATTTGGA GATTCTATGA ACGTTGCATT 600 
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TACGATTGAT TGTTTAATTG GAACCCATGC TTTGATTCAA GATGATGTGA TTTTCCATAA 72 0 

TGTTGGTTTA GTAATTACAG ATGAACAACA TCGATTTGGT GTGAATCAAC GCCAGCTTTT 780 
AAGAGAAAAA GGTGCAATGA CGAATGTGTT ATTTATGACA GCAACGCCGA TACCAAGAAC 840

ACTAGCAATA TCAGTTTTTG GTGAGATGGA TGTGTCTTCA ATTAAACAAT TACCAAAAGG 900

TCGTAAACCT ATCATTACTA CTTGGGCAAA GCATGAGCAA TACGATAAAG TTTTGATGCA 960

AATGACCTCA GAGTTGAAAA AAGGTCGTCA AGCATATGTC ATTTGCCCGC TAATAGAAAG 1020

TTCTGAGCAT CTCGAAGATG TTCAAAATGT TGTCGCATTG TACGAGTCTT TACAACAGTA 1080

TTATGGTGTT TCCCGTGTAG GGTTATTGCA TGGTAAGTTA TCTGCCGATG AAAAAGATGA 1140

GGTCATGCAA AAGTTTAGTA ATCATGAGAT AAATGTTTTA GTTTCTACTA CTGTTGTTGA 1200

AGTAGGTGTT AATGTACCGA ATGCAACTTT TATGATGATT TATGATGCGG ATCGCTTTGG 1260

ATTATCAACT TTACATCAGT TACGCGGTCG TGTAGGTAGA AGTGACCAGC AAAGTTACTG 1320

TGTTTTAATT GCATCCCCTA AAACAGAAAC AGGAATTGAA AGAATGACAA TTATGACACA 1380

AACAACGGAT GGATTTGAAT TGAGTGAACG AGACTTAGAA ATGCGTGGTC CTGGAGATTT 1440

CTTTGGTGTT AAACAAAGTG GaTTGCCAGA TTTCTTAGTT GCCAATTTAG TTGAAGATTA 1500 

TCGTATGTTA GAAGTTGCTC GTGATGAAGC AGCTGAACTT ATTCAATCTG GCGTATTCTT 1560 

TGAAAATACG TATCAACATT TACGTCATTT TGTTGAAGAA AATTTATTAC ATCGTAGTTT 1620 

TGACTAATTG CCATGCTGAT TTGTCAATTT GAGTGCAACa CTTCGTTAAT TGAGTGATAT 1680

GACACTTGAA CTATTTAAAT GTAAAGTGGT ATTTTAACAA TTTATAAATT TTCGACTAAA 1740

TAATAGCTAA ATATTACAGT TATTTGTTGA GTCGGTTAAA TAGAAAGTGT TATGATATGT 1800

GAGGAATGTT TAAGACTAGG TACTAAAAAA TGAGGGGTGA GACGTTGAAA CTAAAGAAAG 1860

ATAAACGTAG AGAAGCAATC AGACAACAAA TTGATAGCAA TCCCTTCATC ACAGACCATG 1920

AACTAAGCGA CTTATTTCAA GTGAGTATAC AAACAATTCG TTtAGaTCGC ACTTATTTAA 1980

ACATACCAGA ATTAAGGAAG CGTATTAAAT TAGTTGCTGA AAAGAATTAT GACCAAATAA 2040

GTTCTATTGA AGAACAAGAA TTTATTGGTG ATTTGATTCA AGTCAATCCa AATGTTAAAG 2100 

CGCAATCAAT TTTAGATATT ACATCGGATT CTGTTTTTCA TAAAACTGGA ATTGCGCGTG 2160 

GTCATGTGCT GTTTGCTCAG GCAAATTCGT TATGTGTTGC GCTAATTAAG CAACCAACAG 2220

TTTTAACTCA TGAGAGTAGC ATTCAATTTA TTGAAAAAGT AAAATTAAAT GATACGGTAA 2280

GAGCAGAAGC ACGAGTTGTA AATCAAACTG CAAAACATTA TTACGTCGAA GTAAAGTCAT 2340

ATGTTAAACA TACATTAGTT TTCAAAGGAA ATTTTAAAAT GTTTTATGAT AAGCGAGGAT 2400
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TTAGAAGCCG TACAAAAGGC TGTTGAAGAC TTTAAAGATC TAGAAATTAT ACTTTTCGGT 2520 

GACGAAAAAA AGTATAATCT GAACCATGAA CGAATCGAAT TTAGACATTG TTCTGAAAAG 2580 

ATTGAAATGG AAGATGAGCC TGTTAGAGCG ATTAAACGTA AAAAAGATAG CTCAATGGTA 2640 

AAAATGGCTG AAGCTGTGAA ATCTGGTGAA GCAGATGGAT GTGTGTCAGC AGGTAATACT 2700

GGTGCTTTAA TGTCAGCTGG TTTATTCATT GTTGGACGTA TTAAAGGTGT AGCTAGACCG 2760

GCTTTAGTAG TAACATTGCC AACGATTGAT GGAAAAGGTT TTGTCTTTTT AGACGTTGGT 2820

GCAAATGCTG ATGCTAAACC TGAACACTTA TTACAGTATG CGCAACTAGG GGATATTTAT 2880 

GCTCAAAAAA TTAGAGGTAT TGATAATCCG AAAATCTCAT TATTAAATAT AGGAACCGAG 2940 

CCAGCTAAAG GTAATAGTTT AACGAAAAAA TCATATGAGT TATTAAATCA TGATCATTCA 3000 

TTGAATTTTG TTGGGAATAT TGAAGCGAAG ACATTAATGG ATGGCGATAC AGATGTTGTA 3060

GTTACCGATG GCTATACTGG GAACATGGTC CTTAAAAATT TAGAAGGTAC TGCAAAATCA 3120 

ATCGGTAAAA TGTTAAAAGA TACGATTATG AGTAGTACTA AAAATAAATT AGCAGGTGCA 3180 

ATATTGAAGA AAGATTTAGC TGAATTCGCT AAAAAGATGG ATTACTCAGA ATACGGTGGT 3240 

TCCGTATTAT TAGGATTGGA AGGTACTGTA GTTAAAGCAC ACGGTAGTTC AAATGCTAAA 3300 

GCTTTTTATT CTGCAATTAG ACAAGCGAAA ATCGCAGGAG AACAAAATAT TGTACAAACA 3360 

ATGAAAGAGA CTGTAGGTGA AtCAAATGaG TaAAACAGCA ATTATTTTTC CGGGACAAGG 3420

TGCCCAAAAA GTTGGTATGG CGCAAGATTT GTTTAACAAC AATGATCAAG CAACTGAAAT 3480

TTTAACTTCA GCAGCGAACA CATTAGACTT TGATATTTTA GAGACAATGT TTACTGATGA 3540

AGAAGGTAAA TTGGGTGAAA CTGAAAACAC ACAACCAGCT TTaTTGaCGC aTAGTTCGGC 3600

ATTATTAGCA GCGCTAAAAA ATTTGAATCC TGATTTTACT ATGGGGCATA GTTTAGGTGA 3660 

ATATTCAAGT TTAGTTGCAG CTGACGTATT ATCATTTGAA GATGCAGTTA AAATTGTTAG 3720 

AAAACGTGGT CAATTAATGG CGCAAGCATT TCCTACTGGT GTAGGAAGCA TGGCTGCAGT 3780 

ATTGGGATT^ GATTTTGATA AAGTCGATGA AATTTGTAAG TCATTATCAT CTGATGACAA 3840

AATAATTGAA CCAGCAAACA TTAATTGCCC AGGTCAAATT GTTGTTTCAG GTCACAAAGC 3900

TTTAATTGAT GAGCTAGTAG AAAAAGGTAA ATCATTAGGT GCAAAACGTG TCATGCCTTT 3960

AGCAGTATCT GGACCATTCC ATTCATCGCT AATGAAAGTG ATTGAAGAAG ATTTTTCAAG 4020

TTACATTAAT CAATTTGAAT GGCGTGATGC TAAGTTTCCT GTAGTTCAAA ATGTAAATGC 4080

GCAAGGTGAA ACTGACAAAG AAGTAATTAA ATCTAATATG GTCAAGCAAT TATATTCACC 4140

AGTACAATTC ATTAACTCAA CAGAATGGCT AATAGACCAA GGTGTTGATC ATTTTATTGA 4200
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AACATCAATT CAAACTTTAG AAGATGTGAA AGGATGGAAT GAAAATGACT AAGAGTGCTT 4 320 

TAGTAACAGG TGCATCAAGA GGAATTGGAC GTAGTATTGC GTTACAATTA GCAGAAGAAG 4380

GATATAATGT AGCAGTAAAC TATGCAGGCA GCAAAGAGAA AGCTGAAGcA GTAGTCGAAG 4440

AAATCAAAGC TAAAGGTGTT GACAGTTTTG CGATTCAAGC AAATGTTGCC GATGCTGATG 4500

AAGTTAAAGC AATGATTAAA GAAGTAGTTA GCCAATTTGG TTCTTTAGAT GTTTTAGTAA 4560

ATAATGCAGG TATTACTCGC GATAATTTAT TAATGCGTAT GAAAGAACAA GAGTGGGATG 4620

ATGTTATTGA CACAAACTTA AAAGGTGTAT TTAACTGTAT CCAAAAAGCA ACACCACAAA 4680

TGTTAAGACA ACGTAGTGGT GCTATCATCA ATTTATCAAG TGTTGTTGGA GCAGTAGGTA 4740

ATCCGGGACA AGCAAACTAT GTTGCAACAA AAGCAGGTGT TATTGGTTTA ACTAAATCTG 4800

CGGCGCGTGA ATTAGCATCT CGTGGTATCA CTGTAAATGC AGTTGCACCT GGTTTTATTG 4860

TTTCTGATAT GACAGATGCT TTAAGTGATG AGCTTAAAGA ACAAATGTTG ACTCAAATTC 4920

CGTTAGCACG TTTTGGTCAA GACACAGATA TTGCTAATAC AGTAGCGTTC TTAGCATCAG 4980

ACAAAGCAAA ATATATTACA GGTCAAACAA TCCATGTAAA TGGTGGAATG TACATGTAAT 5040

ATATTTGAGC TAAAGCTCAT TGACGCAGTG GTTGACTGGT CATCCAATGG AGAATTGTCT 5100

GACCTAGTCA ACTTTGCGGG GGAAATTCTA AGCAACCTAG ATAAGGTTCC AGAATTTCTC 5160 

CCTAAGAAAC ACTAATCAAT aAATTGwTAA GTGTTTCTAA AATTTCTACT TGTTTTTTAG 5220 

AATTTAAAAT GGGAAAATAT AGTAGTCTAT GTATAGGCAT TTTTAAAGGA GGTGAATCGA 5280

CGTGGAAAAT TTCGATAAAG TAAAAGATAT CATCGTTGAC CgTTTAGGTG TAGACGCTGA 5340

TAAAGTAACT GAAGATGCAT CTTTCAAAGA TGATTTAGGC GCTGACTCAC TTGATATCGC 5400

TGAATTAGTA ATGGAATTAG AAGACGAGTT TGGTACTGAA ATTCCTGATG AAGAnGCTGA 5460 

AAAAfitCAAC ACTGTTGGTG ATGCTGTTAA ATTTATTAAC AGTCTTGAAA AATAATAAAT 5520 

CTTACATCTG GGTCGTCAGT ATTGTCGACT CAGTTTTTTT CTTTAATTAT CAATAGTTTT 5580 

AACGTAAAAT TAAAGATGAT TCAAGAGCAA CACATAAAGG AGATAAAATA ATGTCTAAAC 5640

AAAAGAAAAG TGAGATAGTT AATCGTTTTA GAAAGCGCTT TGATACTAAA ATGACAGAGT 5700 

TAGGCTTTAC TTATCAAAAT ATTGATTTAT ACCAACAAGC ATTTTCGCAT TCGAGTTTTA 5760

TTAATGATTT TAATATGAAT CGTTTAGACC ATAATGAGCG TTTAGAGTTT TTGGGTGATG 5820

CGGTATTAGA ATTGACGGTT TCACGATATT TATTTGATAa ACATCCCAAC TTGCCAGAAG 5880

GGAATTTAAC AAAAATGCGT GCCaCTATTG TATGTGAGCC CtCACTkGTA ATATTTGCGA 5940

ATAAAATTGG ATTGAACGAA ATGATTTTAC TTGGTAAAGG TGAAGAGAAA ACAGGGGGAC 6000 
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ATCAAGGACT AGATATAGTT TGGAAATTTG CTGAGAAAGT CATTTTCCCA CATGTAGAAC 6120 

AAAATGAGTT ATTAGGCGTG GTAGATTTTA AAACACAATT CCAAGAATAT GTGCACCAGC 6180 

AAAATAAAGG TGATGTAACC TATAATTTAA TAAAAGAAGA GGGACCGGCA CATCATCGTC 6240

TATTCACTTC A 6251 
(2) INFORMATION FOR SEQ ID NO: 26: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4920 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:


AO-iACTGAA GuG^iAATT ACTAAGCACT GAAATTGAAC GTCTTAAAGA 60

AGATAAAAAA CAACTTGAAA AAGTAATCGA AGAGAGaGAT ACTAATATTA AGTCTTATCA 120

AGACGTGgCA TCAATCTGTA AGTGaTGCTT TGATACAAGC TCAAAAAGCT GGTGAAGAAA 180

CTAAGCAAGC TGCAGAGAAA CAAGCTGAAG CGATTATAGC TAAGGCAGAA GCGCAAgcTA 240

ATcAAATGGT TGGTGACGCG GTAGAAAAAG CACGCCGTTT AGCATTCCAG ACTGAAGATA 300

TGAAACGTCA ATCAAAAGTA TTTAGATCGC GTTTCCGTAT GTTAGTTGAA GCGCAATTAG 360

ACTTATTAAA AAACGAAGAT TGGGATTACT TGTTGAATTA TGATTTAGAC GCTGAACAAG 420

TGACGCTTGA AAATATTCAT CATTTGCATG AAAATGATTT AAAGCCAGAT GAAGTTGCAG 480

CAAATGCACA AAATAATGCA TCAAATACAC CAGACAATAA TCAACAATCC AATGATTCAG 540

AAACAACTAA GAAGTAAGAA TTAAATAAAG ACAGACGCGT AATATACATT TAACTTTTCA 600

CAGCGAATTA GGTAATGGTG AGAGCCTAGT AAAAGCATGT ATGTTATATC ACTGGCTTTT 660

TAATATTTAA ATAATGTAAT GAGAGAACTC TAAGTTGAGT TAATAAGGGT GGTACCGCGA 720






GCAATCGTCC CTTTTAATTT AACTTAGAGT ittttaaatt TTTAAGGAGT GAAAAAAATG 780


GATTACAAAG AAACGTTATT AATGCCTAAA ACAGATTTCC CAATGCGAGG TGGTTTACCA 840

AACAAGGAAC CGCAAATTCA AGAAAAATGG GATGCAGAAG ATCAATACCA TAAAGCGTTA 900

GAAAAAAATA AAGGTAACGA AACATTCATT TTACATGATG GCCCACCATA CGCGAATGGT 960

AACTTACATA TGGGACATGC CTTGAACAAA ATTTTAAAAG ACTTTATTGT ACGTTATAAA 1020

ACTATGCAAG GGTTCTATGC ACCATACGTA CCAGGTTGGG ATACACATGG TTTACCAATT 1080

GAACAAGCAT TAACGAAAAA AGGTGTTGAC CGAAAGAAAA TGTCAACAGC TGAATTCCGT 1140
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TTAGGTGTTC GTGGTGACTT TAATGATCCA TATATTACAT TAAAACCTGA ATACGAAGCT 126 0 

GCACAAATTC GTATTTTTGG AGAAATGGCA GATAAAGGTT TAATTTATAA AGGTAAAAAG 1320

CCAGTTTATT GGTCTCCTTC AAGTGAGTCT TCATTAGCAG AAGCAGAAAT TGAATATCAC 1380

GATAAACGTT CAGCATCAAT TTACGTTGCA TTTGACGTTA AAGATGACAA AGGTGTCGTT 1440

GATGCAGATG CTAAATTTAT TATCTGGACA ACAACGCCAT GGACAATTCC ATCAAATGTT 1500 

GCGATTACCG TTCATCCTGA ATTAAAATAT GGTCAATACA ATGTAAATGG cGAAAAATAT 1560 

ATTATTGCAG AAGCCTTGTC TGACGCTGTA GCAGAAGCAC TGGaTTGGGA TAAAGCATCA 1620

ATCAAATTAG AAAAAGAATA CACAGGTAAA GAATTAGAGT ATGTTGTAGC ACAACATCCA 1680

TTCTTAGACA GAGAATCGTT AGTGATTAAT GGTGATCATG TTACTACAGA TGCTGGTACA 1740 

GgTTGTGTAC ATACAGCACC AGGTCACGGG GAAGATGACT ATATTGTTGG TCAAAAATAT 1800 

GAATTGCCAG TAATTAGTCC AATCGATGAT AAAGGTGTAT TTACTGAAGA AGGCGGCCAA 1860 

TTTGAAGGGA TGTTCTATGA TAAAGCTAAT AAAGCCGTTA CTGATTTATT AACAGAAAAA 1920 

GGTGCACTAT TAAAATTAGA CTTTATTACA CATAGCTATC CACACGACTG GAGAACAAAA 1980 

AAACCTGTAA TCTTCCGTGC TACACCACAA TGGTTTGCCT CAATCAGTAA AGTAAGACAA 2040 

GATATTTTAG ATGCAATCGA AAATACAAAC TTCAAAGTAA ATTGGGGTAA AACACGTATT 2100 

TACAATATGG TTCGTGACCG TGGCGAATGG GTTATTTCTC GTCAACGTGT GTGGGGTGTA 2160 

CCGTTACCAG TATTTTATGC TGAAAATGGC GAAATTATCA TGACGAAAGA AACAGTGAAT 2220 

CATGTTGCTG ATTTATTTGC AGAACACGGT TCAAATATTT GGTTTGAAAG AGAAGCGAAA 2280

GACTTACTAC CAGAAGGATT TACACATCGA GGCAGCCCTA ACGGTACATT TACTAAAGAA 2340

ACAGACATTA TGGACGTTTG GTTTGATTCT GGTTCATCAC ACCGTGGCGT GTTGGAAACA 2400 

AGACCGGAAT TAAGTTTCCC AGCGGATATG TATTTAGAAG GTAGTGACCA ATATCGTGGT 2460 

TGGTTCAACT CTTCTATCAC AACTTCAGTT GCTACAAGAG GAGTATCACC TTATAAATTC 2520 

TTACTTTCTC ATGGTTTTGT TATGGACGGT GAAGGTAAGA AAATGAGTAA ATCTTTAGGT 2580 

AATGTGATTG TACCTGACCA AGTGGTTAAA CAAAAAGGTG CTGATATTGC GAGACTTTGG 2640

GTAAGTAGTA CGGACTATTT AGCTGATGTT AGAATTTCTG ATGAAATTTT AAAACAAACA 2700 

TCTGATGTTT ATCGTAAAAT CAGAAATACA TTAAGATTTA TGTTAGGTAA CATTAACGAT 2760 

TTCAATCCTG ACACAGATAG CATTCCTGAA TCAGAGTTAT TAGAAGTGGA TCGTTACTTG 2820

CTAAATCGTT TACGTGAATT TACTGCAAGT ACGATTAACA ACTATGAAAA CTTTGACTAC 2880

TTAAATATTT ATCAAGAAGT TCAAAACTTT ATCAATGTTG AGTTAAGTAA TTTCTATTTG 2940
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CAAACAGTGT 


TATATCAAAT 


TTTAGTTGAT 


ATGACGAAGT 


TGTTAGCACC 


AATCTTAGTG 


3060 


CATACAGCTG AAGAAGTTTG GTCTCATACA CCACATGTTA AAGAAGAAAG TGTTCACTTA 3120

GCAGACATGC CTAAAGTTGT AGAAGTAGAT CAAGCTTTAT TGGATAAATG GCGTACATTT 3180

ATGAATTTAC GTGATGATGT GAACCGTGCA TTAGAAACTG CTCGTAATGA AAAAGTTATT 3240

GGTAAATCAT TAGAAGCTAA AGTTACGATT GCTAGTAACG ATAAATTTAA TGCATCTGAA 3300

TTCTTAACTT CATTTGATGC ATTACATCAA TTATTTATCG TGTCACAAGT TAAAGTTGTA 3360

GATAAGTTAG ACGATCAGGC AACAGCTTAT GAACATGGTG ATATTGTCAT CGAACATGCA 3420

GATGGTGAAA AATGTGAAAG ATGTTGGAAC TATTCAGAGG ATCTTGGTGC TGTTGATGAA 3480

TTGACGCATC TATGTCCACG ATGCCAACAA GTTGTAAAAT CACTTGTATA ATTGAAATTG 3540

TATAAAGTAC TCATACAGAT GATATAAATT AAAGCTCTCT TCATAATCAT GTTGTAGTTT 3600

TTGTTGACAT GATGAAGAGA GTTTTTTTGT GAATAAAAAA ATGACCAAGT TACCGGTCAT 3660

ATATGTAAAA AATGTGCGAT TTACTAAAAT AAAAATTATT CAGGAATGGT ACAAATTCTC 3720

TGAGGCATAT AAATGCGTTA TAGTTGCTAT TCTCAATTAT GTTCGCGATA ATTTTAAGTA 3780

AAAGTAAGCA CAGATATTGA ATTTGATAGG AGTTAATTGA ATGTATCATA ACAGTAACGC 3840

AAACTTTGTC AATGGTATCA CTTTAAATGT GAGAGATAAG AATGAATTAA AGCCATTTTA 3900

TGAGGACATA TTAGGATTAA ATATTATAAA TGAGACATTA ACATCGATAC AATATGAAGT 3960

AGGTCAAAAT AATCATGTCA TTACACTTGT TGAATTACAA AATGGACGTG AACCTTTAAT 4020

GTCCGAAGCG GGACTGTTTC ATATCGCAAT TAAACTACCT CAAATTAGTG ATTTAGCTAA 4080

TTTACTAATT CATTTAAGCG AATATGATAT TCCAGTTAAC GGAGGTATAC AGCCTGCTTC 4140

GTTATCATTA TTTTTTGAAG ACCCGGAAGG AAACGGTTTT AAATTTTATG TTGATAAAGA 4200

CGAAGCGCAA TGGACGAGGC AAAATAATTT AGTAAAAATT GATATTAGAC CATTAAATGT 4260

ACCGAGATTA GTGAGTCATG CAACAAAATT GTTATGGTTA GGTATTCCAG ATGACGCTAT 4320

TATAGGTGCA TTGCATATTA AGACAATTCA TTTATCAGAG GTAAAAGAGT ACTACCTCGA 4380

TTATTTTGGA TTAGAGCAAT CGGCATATAT GGATGATTAT TCAATATTTT TAGCATCGAA 4440

TGGCTATTAT CAACATTTGG CCATGAATGA TTGGGTATCA GCAACGAAAC GTGTAGAAAA 4500

TTTTGATACG TATGGATTAG CAATTGTTGA CTTTCATTAT CCTGAAACAA CACATTTAAA 4560

TTTACAAGGT CCGGATGGTA TCTATTATCG CTTTAATCAT ATCGAAGTTG AAGATTAGTA 4620

TATACTTTGA ATGGACGAAC CATATAATGA ATCGTTTTTA ATGATCTTTT TATACAAGTT 4680

ATGAAGGAGG CTGGGACATT AAGTTCTTAG GCAATGTAAA AAGCTGATTT CTATTAATTA 4740
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TTTTCCTTAT ATTAATTGCC ATTAATACAA AACCTAGCTC TCGTTTAACT TTATTTATTC 4 860 

CTCGAACTGA CATTCGnGTG AACTCAAAAT nGCCTACTTn CTTAAATTAC CAATATCTAT 4920

(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 626 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



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15 (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27: 

TGGATTGCCA TTACATGGAC AAGATTTAAC TGAATCAATT ACACCATATG AAGGTGGTAT 60 

CGCTTTTGCA AGTAAACCAT TAATTGATGC TGATTTTATT GGTAAATCTG TATTAAAAGA 120 

TCAAAAAGAA AATGGTGCAC CAAGAAGAAC AGTGGGATTA GAATTACTTG AAAAAGGAAT 180 

TGCAAGAACT GGTTATGAAG TTATGGATTT AGATGGAAAT ATTATTGGAG AAGTAACTTC 240 

AGGAACACAG TCTCCATCAT CAGGAAAATC AATTGCACTT GCAATGATAA AAAGAGATGA 300

GTTTGAAATG GGTAGAGAGT TGCTTGTTCA AGTTCGTAAG CGTCAATTAA AAGCGAAAAT 360

TGTTAAGAAA AATCAAATTG ATAAATAATT AAAAAGGGGT GTGCATTGTG AGTCATCGTT 420

ATATACCTTT AACTGAAAAA GACAAGCAAG AAATGTTACA AACAATTGGT GCAAAATCTA 480

TAGGAGAATT ATTCGGTGAT GTACCAAGTG ACATTTTATT AAATAGAGAT TTAAATATTG 540

CTGAAGGCGA ACGGAGAACA ACGTTACTTA GAAGATTnAA TCGCATTGCA AGCAAGAGTA 600

TCACTAGAGG AACGCGTACA TCGTTT 626

(2) INFORMATION FOR SEQ ID NO: 28: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1126 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28: 
nGGAAGTGGT GTATATATTT GTAATGAGTG TATTGAATTA TGCTCAGAAA TCGTCGAAGA 60 

AGAATTAGCT CAAAACACTT CTGAAGCGAT GACAGAATTA CCTACTCCTA AAGAAATTAT 120

GGATCATTTA AACGAATATG TTATTGGTCA AGAAAAAGCT AAAAAATCTT TAGCTGTAGC 180 

TGTTTATAAC CACTATAAGC GTATTCAACA ATTAGGACCA AAAGAAGATG ATGTTGAATT 240 

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AACCTTAGCC AAGACGTTGA ATGTACCATT TGCAATTGCA GATGCGACAA GTTTAACTGA 360 

AGCTGGTTAT GTAGGCGATG ATGTTGAAAA TATCTTGTTG AGATTAATTC AAGCAGCTGA 420 

CTTTGACATT GATAAAGCCG AAAAAGGTAT TATTTATGTA GATGAAATTG ATAAAATTGC 480

ACGTAAATCT GAAAACACAT CTATAACACG TGACGTTTCA GGTGAAGGTG TTCAACAAGC 540

ATTGCTTAAA ATCTTAGAAG GTACGACTGC AAGTGTTCCG CCACAAGGTG GACGCAAACA 600

TCCAAACCAA GAAATGATTC AAATTGATAC AACAAATATC TTATTTATTC TTGGTGGTGC 660

CTTTGATGGT ATTGAAGAAG TGATTAAGCG CCGTCTTGGT GAAAAAGTTA TTGGTTTCTC 720

AAGCAATGAA GCTGATAAAT ATGACGAACA AGCATTATTA GCACAAATTC GCCCAGAAGA 780

TTTGCAAGCC TATGGTTTGA TTCCTGAATT TATCGGACGT GTGCCAATTG TAGCTAATTT 840

AGAAACATTA GATGTAACTG CGTTGAAAAA CATCTTAACG CAACCTAAAA ATGCACTTGT 900

GAAACAATAT ACTAAAATGC TGGAATTAGA TGATGTGGAT TTAGAGTTCA CTGAAGAAGC 960

TTTATCAGCA ATTAGTGAAA AAGCAATTGA AAGAAAAACA GGTGCGCGTG GTTTACGTTC 1020 

AATCATAGAA GAATCGTTAA TCGATATTAT GTTTGATGTG CCTTCTAACG AAAATGTAAC 1080 

GAaGGTAGTT ATTACAGCAC AAACmATTAA TGrAGaACTG AACCAG 1126 
(2) INFORMATION FOR SEQ ID NO: 29: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 4392 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29:

ATTGACTTCT TAGCAATnAA TaTGAGTGAA GAACGTACTG TTGAAGTACC AGTTCAATTA 60 

GTTGGTGAAG CAGTAGGCGC TAAAGAAGGC GGCGTAGTTG AACAACCATT ATTCAACTTA 120 

GAAGTAACTG CTACTCCAGA CAATATTCCA GAAGCAATCG AAGTAGACAT TACTGAATTA 180 

AACATTAACG ACAGCTTAAC TGTTGCTGAT GTTAAAGTAA CTGGCGACTT CAAAATCGAA 240

AACGATTCAG CTGAATCAGT AGTAACAGTA GTTGCTCCAA CTGAAGAACC AACTGAAGAA 300 

GAAATCGAAG CTATGGAAGG CGAACAACAA ACTGAAGAAC CAGAAGTTGT TGGCGAAAGC 360 

AAAGAAGACG AAGAAAAAAC TGAAGAGTAA TTTTAATCTG TTACATTAAA GTTTTTATAC 420

TTTGTTTAAC AAGCACTGTG CTTATTTTAA TATAAGCATG GTGCTTTTTG TGTTATTATA 480

AAGCTTAATT AAACTTTATT ACTTTGTACT AAAGTTTAAT TAATTTTAGT GAGTAAAAGA 540
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CTTACTAAGC 


TAAAGAATAA 


TGATAATTGA 


TGGCAATGGC 


GG AAAATGGA 

UV3/V1AA X w\J#\ 


^H^^. J i J k VMTP^^ & HP^P 
lVj 1 X vj 1 V^M. 1 1 


660 


5 


ATAATAATAA 


ATGAAACAAT 


TATGTTGGAG 


GTAAACACGC 


ATGAAATGTA 


x x vj x nuu iwi 


~W "» 

72 0 




AGGTAATATA 


GGTAAACGTT 


TTGAACTTAC 


AAGACATAAT 


ATCGGCTTTG 


AAGTCGTTGA 

IWJ X NrVJ X X \Jf\ 


T O A. 




TTATATTTTA 


GAGAAAAATA 


ATTTTTCATT 


AGATAAACAA 


AAGTTTAAAC5 


mY^PATATAP 


84 0 


10 


AATTGAACGA 


ATGAACGGCG 


ATAAAGTGTT 


ATTTATCGAA 


CCAATGACAA 


TGATGAATTT 

X \An. X W/V\ XXX 


Q A A 

y oo 




GTCAGGTGAA 


GCaGTTGCAC 


CGATTATGGA 


TTATTACAAT 


G TTAATC CAf3 
vj x x nn x \v unu 


AA^ATTTfiAT 


/T 

So 0 




TGTCTTATAT 


GATGATTTAG 


ATTTAGAACA 


AGGACAAGTT 


X 


A A A & AfTfl A Z\f2 
AMAAAuvxAAv) 


1020 


15 


TGCGGGCGGT 


CACAATGGTA 


TGAAATCAAT 


TATTAAAATG 


W X X A AwlU 


nLwvil X InA 


lUbU 




ACGTATT CGT 


ATTGGTGTGG 


GAAGACCAAC 


GAATGGTATG 


A PfVlT A f^rrrj 

/av»\«7\j x ASvV. x Vj 


& R ^P^^T^^P^P^P 

A 1 x A J. Vj 1111 


1140 


20 


ACAACGCTTT 


TCAAATGATG 


AAATGGTAAC 


GATGGAAAAA 


GTT aTPfi A A 
w X X A X UVJ/vi^ 


A V.. VjrS^Avj V_AL. \j 


1200 


CGCAATTGAA 


AAGTTTGTTG 


AAACATCACG 


ATTTGACCAT 


GTTATGAATG 

\JX Xl^Xw^WLXVJ 


A A'l*l "1*2111 f TYlT' 


12 o O 




TGAAGTGAAA 


TAATGACAAT 


ATTGACAACG 


CTTATAAAAG 


AJVGAT A 1TP A 
AAV*A Inn 1 W*A 


1111 k-AAUAL. 


13 2 0 


25 


CTTAATCAGG 


TATTTGGACA 


AGCAAACACA 


AflX A. A PTrt 

V- A fWJf X /V\v X w 




Vj 1 UAVjL, 1 AAA 






GTGACGATGA 


TTGCTGAAAA 


ATATGCACAA 


AGTAATCAJVP 
■a\j x x ^«nn^ 


nVJx Inl lx\l X 


AA1 1 AL LftAi 


144 O 




AATTTATACC 


AAGCAGATAA 


ATTAGAAACA 




AATTTATAfSA 
AMI 1 InlntJn 


1 VjV_ IvjAAVjAA 


"1 C A f\ 

lb UO 




TTGTATAAGT 


ATCCTGTGCA 


AGATATTATG 


ACCGAAGAGT 


T w n , r*AJv r* a r* a 

XXX UVlLAWl 


A AfS/TVTf^n 7A 
nnuUCW 1 Uin 


1 c C f\ 




CTGATGAGTG 


AACGTATTAG 


AACTTTAACT 


G CGTTAGCTC 


AAGGTAAGAA 
nnvjvjinnvjnn 


AIjV>%j 1 1 A 1 1 1 


icon 




ATCGTTCCTT 


TAAATGGTTT 


GAAAAAGTGG 


TTAACTCCTG 


TTGAAATGTG 




1 fan 


35 


CAAATGACAT 


TGCGTGTTGG 


TGAGGATATC 


GATGTGGACC 


X X l\al X4%f% 


>a*/WW%X XAvl 1 


1 Txl A 




AATATGGGGT 


ACAAACGGGA 


ATCCGTGGTA 


TCGCATATTG 


GTGAATTCTC 


ATTGCGAGGA 




40 


C5GTMTATCG 


ATATCTTTCC 


GCTAATTGGG 


GAACcAATGA 


GAATTGAGCT 


ATTTGATACC 




GAAATTGATT 


CTATTCGGGA 


TTTTGATGTT 


GAAACGCAGC 


GTTCCAAAGA 


TAATGTTGAA 


17xU 




GAAGTCGATA 


TCACAACTGC 


AAGTGATTAT 


ATCATTACTG 


AAGAAGTGAT- CAGCCATCTT 


X 7 O W 


45 


AAAGAAGAGT 


TAAAAACTGC 


ATATGAAAAT 


ACAAGACCCA 


AAATAGATAA 


ATCAGTGCGC 


2040 




AATGATTTGA 


AAGAAACGTA 


TGAAAGCTTT 


AAATTATTCG 


AAAGTACATA 


CTTTGATCAT 


310 0 

« X> W V 




CAAATACTAC 


GTCGCTTAGT 


AGCGTTTATG 


TATGAAACAC 


CTTCGACAAT 


TATTGAGTAT 


2160 


50 


TTCCAAAAAG 


ATGCAATCAT 


TGCAGTTGAT 


GAATTTAATC 


GTATTAAAGA 


AACTGAAGAA 


2220 




AGTTTAACAG 


TAGAGTCTGA 


TTCGTTTATT 


AGCAATATTA 


TTGAAAGTGG 


TAATGGATTT 


2280 




ATAGGACAAA 


GTTTTATAAA 


ATATGATGAT 


TTTGAAACAT 


TGATTGAAGG 


CTATCCTGTC 


2340 
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TCATGTAAAC CTGTCCAACA ATTTTATGGG CAATATGACA TTATG CGTTC TGAATTTCAA 2460 

CGATATGTTA ATCAAAACTA TCATATCGTG GTTTTGGTCG AAACCGAAAC TAAAGTTGAA 2520 

CGTATGCAAG CGATGTTAAG TGAAAtGCAT ATTCCATCAA TAACAAAATT GCATCGCTCA 2580 

ATGTCATCGG GGCAAGCAGT GATTATTGAA GGCAGTTTAT CTGAAGGATT TGAACTACCT 2640 

GATATGGGAT TAGTTGTCAT TACTGAGCGT GAgcTTTTTA AATCAAAACA GAAAAAGCAA 2700

CGAAAACGTA CGAAAGCTAT CTCAAATGCT GAAAAAATTA AGTCTTACCA AGATTTAAAT 2760

GTGGGAGATT ATATTGTTCA TGTGCATCAT GGTGTTGGTA GATATTTAGG TGTTGAGACG 2820 

CTCGAAGTGG GGCAAACGCA TCGTGATTAT ATTAAATTGC AATATAAAGG TACGGATCAA 2880 

CTATTTGTTC CAGTAGATCA AATGGATCAA GTTCAAAAAT ATGTAGCTTC GGAAGATAAG 2940 

ACGCCAAAAT TAAATAAACT CGGTGGCAGT GAATGGAAAA AAACAAAAGC TAAAGTTCAA 3000 

CAAAGTGTTG AAGATATTGC TGAAGAGTTG ATTGATTTAT ATAAAGAAAG AGAAATGGCA 3060

GAAGGTTATC AATATGGGGA AGACACAGCT GAGCAAACAA CATTTGAATT AGATTTTCCA 3120

TATGAACTTA CGCCTGACCA AGCTAAATCT ATCGATGAAA TTAAAGATGA CATGCAAAAA 3180 

TCGCGTCCAA TGGATCGCTT GCTATGTGGT GATGTTGGTT ATGGTAAAAC TGAAGTTGCA 3240 

GTGAGAGCAG CATTCAAAGC TGTAATGGAA GGAAAGCAGG TTGCATTTTT AGTTCCTACA 3300

ACTATTTTAG CTCAGCAACA TTATGAGACG TTAATTGAGC GTATGCAAGA TTTTCCTGTT 3360 

GAAATTCAAT TAATGAGTCG TTTTAGAACG CCTAAAGAGA TAAAACAAAC TAAGGAAGGA 3420 

CTTAAAACTG GATTTGTTGA CATAGTTGTT GGTACACACA AATTACTTAG TAAAGATATA 3480 

CAGTATAAAG ATTTAGGGCT GTTGATTGTA GATGAAGAAC AACGATTTGG TGTACGCCAT 3540

AAAGAGCGTA TTAAAACATT AAAACATAAT GTAGATGTAC TAACATTGAC TGCAACCCCA 3600 

ATAGCTAGAA CATTGCATAT GAGTATGCTA GGTGTGCGGG ATTTGTCAGT GATTGAAACG 3660 

CCGCCAGAAA ATCGTTTCCC AGTTCAAACA TATGTATTAG AACAGAACAT GAGTTTTATC 3720 
AAAGAAGCTT TAGAAAGAGA ACTATCCCGT GATGGCCAAG TGTTTTATCT TTATAATAAA 3780
GTGCAATCCA TTTATGaAAA ACGAGAACAA CTCCAGATGT TAATGCCAGA TGCTAACATT 3840
GCAGTTGCTC ATGGACAAAT GACAGAGCGC GATTTAGAAG AAACGATGTT AAGTTTTATC 3900

AATAATgAAT ATGATATTTT AGTAACGACG ACGATTATTG AAACAGGTGT CGATGTCCCA 3960
AATGCAAATA CTTTGATCAT TGAAGATGCA GATCGCTTTG GATTGAGTCA GTTGTATCAA 4020
TTAAGAGGTC GTGTTGGTCG TTCAAGTCGT ATTGGTTATG CATACTTCTT ACATCCAGCA 4080
AATAAGGTAC TAACTGAGAC TGCAGAAGAT CGATTACAAG CGATTAAAGA ATTTACGGAG 4140 

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TTAGGTAAAC AACAGCACGG CTTTATTGAT ACAGTTGGAT TTGATTTGTA CAGTCAAATG 4260 

TTAGAAGAAG CTGTAAATGA AAAACGTGGT ATTAAGGAAC CAGAATCTGA GGTGCCAGAA 4320

GTCGAAGTTG ATTTAAACTT GGATGCATAT TTGCCAACAG AATATATTGC AAATGAACAA 4380 

GCTAAAATTG AA 4392 
(2) INFORMATION FOR SEQ ID NO: 30: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 729 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30: 

TTTCTTTTGA ATCTATATCG AGGTGGTTGG TAGGTTCATC TAAAATAAGT ACATTGTCAC 60

GTTGCAACAT AAGTAGTGCT AGTTGTAAAC GTGCTTTTTC ACCACCAGAT AAATCATTAA 120 

TTATCTTTTT AACATCGTCT TGTACAAATA AGAAACGTCC AAGAACTGCT CGAATATCTT 180 

TTTCATTCAT TAACGGATAT TGATCCCACA CATAATCTAA AATCGTTTTA CTAGATTTAA 240

ATTCTGCTTG CTTTTGATCA TAATAACCAA TTTGTAAATT TGCGCCGAAA GTAATATCGC 300

CATTAAGCGC TTTTTGTTGA TTAGCAATAG TTTTAATTAA GGTCOATTTT CCAATACCAT 360

TTGGCCCAAT GATTGCTATA TGATCGCCTT TAGAGACCTC TATACTCATA GGTTTGGTAA 420

TTGCAGTTTG ATAACCGATT TCTAAATTTT TTACATGCAT GACGTCATTA CCTGTATTCC 480

GGTCAAAGCC AAATTGAATA TTTGCACTTT TGGCATCTAA CATTGGTTTA TCAATGCGTT 540

CCATTTTTTC TAAAATCTTA CGTCTACTTT TTGCCATTCC ACTTGTTGAA GCACGGGTAA 600 

TATTTTTCTC AACAAAAGTT TCTAATCGTT TTATTTCTGC TTGTTGACTT TCATATTCTT 660 

GCATTCGTTT TTGATAATAT AAATCCCGTT GCTGTATAAA TTCCTCGTAA TTACCAACAT 720 

AGCGTTTGA 729
(2) INFORMATION FOR SEQ ID NO: 31: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13856 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
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TGATGTTTCG ATACATTTGT TGCACCTTGT GGATATACTT TAAAGGTTGT GTCGTATGTT 120 

TCCTTACTAT CTTTAGCTTC AGATTCCTGT GATTCAACCG TTTTATATTT TTCAAGTGCA 180 

TGTCCTTCAA TATCAACTCG TGGAATAATG CGATTCAACC ATGCTGGTAA ATACCACGAA 240

CCTTTtCCAA ACAATTTCGt TAATGCAGGA ATTAACATCA TtCTGACTAC GAAGGCATCA 300

AAGAGTACAC CAAACGCTAA TGCCATACCC ATTGATTTAA TCATGACATC TTCTTGGAAT 360 

ACAAACGCAA AGAAGACACT AAACATAATT AATGCAGCTG CTACAATAAC AGGACCGCTT 420

TCTTTCAATC CTACTTTGAT AGAATAATCA TTATCCCCTG TTTTACTATm yyCTTCATGr 480

ATTCGCGACA TAAGGAAGAC TTCATAATCC ATCGCTAATC CAAATAAGAT ACCTATAGTA 540 

ATAACCGGTA AAAATGCTAG CATTGGTCCT GTCGTTTCAA TACCAAACAG ACCTTTCATA 600

AAACCATCTT GCATTACTAA TGTTGTAAAT CCTAATGTTG CCATTAATGA CAAGACGAAT 660 

CCTAAAACTG CTTTTAATGG TATTAGAATT GAACGGAAGA CAATCATTAA TAAGAAAAAT 720

GCTAATACAA CAATGACTGA GGCAAATAAA GGTATCGCCT CATTTAACTT TTTAGACATA 780

TCAATATTAA TGACACTTTG TCCCGAAATC TCCGTTTTGA ACCCATATTT ATCTTGTGCA 840 

TCTTTATGAT AATCTCGTAA ATCATGCACT AAATCATTTG TACTCTCTGC ATTAGGCCCT 900 

TGCTTAGGTA TCACGACCAT CAAAGCGTAA TCATTATCTT TACTCATTTG TGGTGGCGTA 960

ACGATATCTA CATTTTTCTT ATCTTTAATA TCTTTATATA CAGACTGTAA ATCTTGTTGT 1020 

AATCCTTGTG GATCATCCTT TTTATCTTTC ACATTTATCA ACATCGGTAT TTGGCCATTA 1080

AATCCTTCAC CAAATTTATC CGAGATAATA TCGTAAGCTT TTTTCTGTGT AGAATCTGCT 1140

GGTTTAACAC CGTCATCTGG AATACCAAGT CGCATATGAC TAACTGGTAT TGCAGCTGCT 1200

ACTAATATGA TTAAACCTAG TAATACTGCC GCAAGTGCAT TTCCTGTAAT AAATTTAGAC 1260

CATGGCGTAT CAATATCTTT TTTGAATTTA GACTGTAATT TATTCACTTT AATGCGTTtA 1320 

TGGAAAATGC TTATTAATGC AGGTAATAAA GTTAAAGCGC TAAGTACTGC AAAAACAACA 1380

CTAATTGCCG AAGCAAATCC CATTACCGCT AAGAAGTCAA TGCCTACTAA TGATAAACCA 1440 

CATACTGCAA TTACAACTGT TACACCAGCA AAAACAACTG CACTACCTGC TGTTCCTATT 1500

GCAAGACCAA TGCCTTTAAT GTAATCTGTT TCAGTTTTCA TAACTTGTCG ATATCTGAAT 1560 

AAAATAAATA ATGCATAATC GATACCAACT GCTAGTCCAA TCATTACGGC TAATGTCAGT 1620

GTGACATTTG GTATATCGAA TGCATAAGTT AACAAACTGA TAATACCTAC ACGAGAGGCT 1680

AGACCAATCA ATGCACTTAT AATTGGTAAT CCTGCAGCAA TGACTGAACC GAATGTGATT 1740

AACAGTACAA CAAATGCAAC AATAATACCA ACTAGTTCAG AATTACCGCC TACTTCTGTA 1800
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AAATGACTTT T AACATT AT C TCTAGAGCCA TCTTTTAAAG ATGTTTGACT AACGTCATAT 1920 

GTGATATCTG CAAATGCAGT TGTTTTATCT TTACTAATTT GCTTATTTTC ATAAGGATCT 1980 

GATATTTTAT CAATGTGCTT GTCATCTTTT TTAATATCAT CTAACGTTTT CTTAATATCT 2040 

TTAGTAATGT TCGGTTGCAC AATACCATCA TCTTTAGTCG TCTTAAAGAC AACACGTATT 2100 

TGTGCCTTTT CACTATCTTG ATTAAAATGT TTTTCAATCT TTTTATTCGT ATCTAACGAC 2160 

TCTAATCCTG TCATTTTAAT ATCATTGTCA AATTTCGGTG CATTTGTAGC AAGTGGTATC 2220 

AATATTGCAG CTACAATCAC TATCCATGCA ATGACCGCGG ACCATTTATG TTTTGCGATG 2280

AATGTCCCCA TCTTATATAA AAATTTTGCC AAAGTATATT GCCTCCTTTT AAAATCAACG 2340

TTATAGTTTA AATATACAGT GTAGATTATT GTTCGATTAT AGTATCTATC CCCGACCTCT 2400

TAAAGAATCA ATTGGAAAAT TTTGTATATT AAACTACACA CAAAGGAGAA ATGTAGATGA 2460 

AAGAGACTGA TTTACGAGTT ATAAAGACAA AAAAAGCATT GTCGAGTAGC TTGCTACAAT 2520 

TGTTAGAACA GCAATTATTC CAAACGATTA CTGTCAATCA AATTTGCGAC AACGCACTCG 2580

TACACCGTAC AACATTTTAT AAACATTTTT ATGATAAATA TGATCTTCTA GAGTACTTGT 2640 

TCAATCAATT GACTAAAGAC TACTTTGCTA GAGATATCAG TGACCGTCTT AATCATCCAT 2700 

TCCAAACGAT GAGTGATACG ATTAATAATA AAGAGGATTT GAGAGAAATC GCAGAATTCC 2760 

AAGAAGAAGA CGCTGAATTT AATAAAGTAT TAAAAAATGT CTGCATTAAA ATTATGCATA 2820

ACGATATCAA AAATAATAGA GACCGTATCG ATATTGACAG CGACATCCCA GATAATCTCA 2880

TATTTTATAT TTATGACTCG TTGATTGAAG GTTTTATACA TTGGATAAAA GATGAAAAAA 2940

TTGATTGGCC TGGCGAAGAT ATTGATAACA TTTTCCATAG ATTAATCAAT ATTAAGATTA 3000

AATAGTAGAT GAGAAACTCA TGAGCGTTAC CAACATTCAT AATAAAAACG ATAGTGkACA 3060 

CGTTAATGAA TTCGTGTACT ACTATCGTTT TTTATTTTTA TCGTGCTTAT CGCTATTAAA 3120 

ACAACTGATA CACAACACAT AAACTATGAA GAAAAAAATA AATCCGCTAT CTAAATGACT 3180 

TTGACTCAGT TGTTTAAATG ACCAAATTGC TAATACAATT CCCATTATTA TTGAAATAAC 3240

GTATCTCACA TTCTTATACC TATAATCCTT TTCTAAAAAT ATGGTTGCTA TTACTTAATT 3300 

TTTAAAGTTA TAAATAAAAA GAGCCAACCG CAATGGATGG CCCTTGTTCA TTATGAAGCA 3360 

TTAGAACATT TCTGAAACAA CCTTTTGTTC TAAGAAGTGT AATAAGTAGT CTGGACTACC 3420 

TGTTTTAGCG TCCGTACCTG ACATTTTGAA ACCACCAAAT GGATGGTATC CAACAACTGC 3480

TGAAGTACAG CCTCTGTTAA GGTATAAATT GCCTACATCA AATTCGTTTA CCGCTTTAAT 3540

CCAATGCTCG CGATTATTTG TAATCACTGC ACCAGTTAAA CCGTAATCTG TATCATTTGC 3600
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TTCTTCTTGC ATGATTCTAT CTTTAGATTT AAGTCCTGAA ATGATTGTTG GTTCTACAAA 3720 

GTAACCTTTT GAATCATCAG TGCCGCCACC TTGTTCTAAT TTACCTTCTT CTTTACCAAT 3780

CTCAATATAA TTTTTAATCT TATCAAATTG TTTTTTATTA ATAACTGGGC CCATATACGT 3840

ATTGTCTACA GTATTGCCCA ACGTTAATTC TTTTGTTAAT TTGATTGATT TCTCTAATAC 3900

TTCGTCATAA ACGTCTTTAT GCACAATTGC ACGTGAACAT GCTGAACATT TTTGACCAGA 3960

AAAACCAAAT GCTGACGTTA CAATAGCTTC TGCTGCCATA TCTGTATCAA TATTTTCATC 4020 

AACTACAATG GCATCTTTAC CACCCATTTC AGCGATAACA CGTTTCAAGA AGTTTTGACC 4080

TTCTTGAACA ACGGCACTAC GTTCATAAAT TCTAGTACCT GTCGCACGTG ATCCTGTAAA 4140

TGTAACGAAA TGCGTATCTT TATGATCAAC TAAGTAATCA CCAATTTCTT TCGGATCACC 4200 

AGGAACAAAG TTAACTACGC CTTTTGGTAA TCCTGCTTCT TCTAAAATTT CCATTAATTT 4260 

ATAAGCGATA TAAGGTGTAT CCTCAGCAGG TTTCAATAAC ACTGTATTAC CTGCCACAAC 4320

TGGTGCTAAA GTTGTACCAG CCATAATCGC AAACGGGAAG TTCCACGGCG GAATTGTAAC 4380 

ACCTGTACCA ATTGATTTAT AGAAATATTT ATTGTGTTCA CCTTCACGAT CAAGTACTGG 4440 

CTTACCTTGA GCCAAGTCCA TCATTGAACG TGCATAGTAT TCAATAAAAT CAATACCTTC 4500

AGCTGCATCA CCAACTGCTT CATCCCATGG CTTACCTGCT TCATAAACCA TAATTGCTGC 4560 

AATTTCCGCT TTTCGACGAC GAATAATTGC CGAAACACGT AACATAAGCT CTGCACGATC 4620 

ATTTGCTGAC CATGTTTTCC AAGATTTATA AGCTTCGTTT GCTGCTTTAA ACGCATCTTC 4680

AACATCTTGT TTTGTTGCCT TTGATGCATT TGCAATCACT TGTGATGTGT CTGCAGGATT 4740

GATTGATTTA ATTTTGTCAT CTTTGAAAAT CTTCTCTCCA TTAATCACTA ATGGTATGTC 4800

TTGACCTAAT TCTTTTTCCA CGTCTTTCAA TGCTTTCTTA AACATATCCA CATTTTCTTG 4860

GACTGAAAAA TCGTAACCAG GTTCATTTTT AAATTCTACT ACCATGTACA CTTACCCCCT 4920

ATAAATTTTG AAAGTGGTTT AACCCTTTGA TTTAATGATA TAACATCATT TAAACTCATT 4980 

TTACTATGAT TAAGGTTAGT TTTGCAATCG CTTTCATTTT TATGTTTTAT CACTTATTCT 5040 

CAAGTATTTT GAAATTGATT GGTTACTTTT TAAAATTTAT ATGGGTCGCA ACTGCTACTT 5100 

TATCGTTTCG TCATTTAATG TTTCGGATGG TAGGTCATTA TCAATTTTAC GAACGACTTT 5160

ACAAGGGTTT CCAACCGCTA AGCTGTGTGG CGGAATATCT TTAGTGACAA CACTACCAGC 5220 

ACCAATCACA CTGCCTTCTC CAATCGTCAC CCCTGGTAAC ACGGCTACAT GACCGCCAAA 5280

CCAAGTATTA CTGCCAATAT GAATGGGTCC GGCTTTTTCA AAACCTTCAT TTCTATGATG 5340

GAAATTAAGT GGATGTGTCG CTGTGTAGAA TCCACAATTA GGTCCTATAA AAACATTATC 5400
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TCCTAGTTTA ACGTTCCAAC CATAATCTGT ATCAAAAGGA ATCGAAATAC TTACATTGTC 5520 

TGTTGTTGTT TGAAATAATT GATCAATTAA TTCCTTTCTT TTATTTGTAG CACTCGGTCT 5580 

TGTATGATTT AATTCAAAGC AAATATCTTT CGCTCGTGCA CGTTCATTGA TTAAGTATTG 5640

ATCAAAGTTT GCATCGTACC ATTTTTCTGC TAACATTTTT TCTTTTTCAG TCATTACACC 5700 

TTTCAACTCC TAATAACTTA TTTACTTGTT TAAAAGTTAA TCAAATAAAC CTTCGCCTAT 5760 

GCAACTAATA CGCTATAACA TTATGAAATC ATGACCTTAT CACCCTTATC TATACAATTC 5820 

TCGCATCAAA TACTGCTAAA GTAGTAGATA AATTCAATAC TACAGACGCA TTCATTTTTT 5880

AATCTATTAA CGTACAATGT GAGTAAGAGA AATATAAAGG AGTATGATAG CGATGAGAAT 5940

ATTAATTACA GGCACAGTTG CTATCTTAAT CATTCTAGGT TTGGTCAAAA CGATACAAGA 6000 

TTACGAAATG ACAAACGACA CGAGTCGTcA GTTGTCAGAC AACAAAGATG ATGATAAAGT 6060 

CATCCATCTT AATAATTTTA AAAATTTACA TGCGAAAGAA TTTAACCCAT CTGATTTCTT 6120 

TTAAGTCACC TAAGAATTGC AAATCCAGAA GTCATTTAAG TTTTACCTTT CATTCATACA 6180 

TCCTTTAATA TTAATTACGA CTTCTTTTAT ATAGATGCTA AGTAGAGAGA TTGTTGTGCA 6240

ATGTTTGCAC GGCAATCTCT CTTTTTCTTT TTAAAATTGG TAAAAGTAAA ACGCAACGAT 6300 

TGACTTATAT ACCTATAGGG GGTACATTAG ACGTGTAACA ATGAATCACA GGGAGGCAAT 6360 

AATGTGGCTA ATACGAAAAA AACAACATTA GATATCACTG GTATGACTTG TGCCGCATGT 6420

TCAAATCGTA TCGAAAAGAA ACTGAATAAA CTTGATGACG TTAATGCCCA AGTGAATTTA 6480

ACTACAGAGA AAGCAACTGT TGAGTATAAC CCTGATCAAC ATGATGTCCA AGAATTTATT 6540

AATACGATTC AACATTTAGG TTACGGTGTC GCTGTAGAAA CTGTCGAATT AGACATTACA 6600

GGTATGACTT GTGCTGCATG CTCAAGCCGT ATTGAAAAAG TGTTAAATAA AATGGACGGC 6660 

GTTCAAAATG CAACGGTCAA TTTAACAACA GAGCAAGCTA AAGTTGACTA TTATCCTGAA 6720 

GAAACAGATG CTGATAAACT TGTCACTCGC ATTCAAAAAT TAGGTTATGA CGCGTCTATT 6780 

AAAGATAACA ATAAAGATCA AACGTCACGC AAAGCTGAAG CGCTACAACA TAAATTGATT 6840

AAGCTTATCA TATCAGCAGT ATTATCTTTA CCACTATTAA TGTTAATGTT TGTACATCTT 6900

TTCAATATGC ATATACCAGC ACTATTTACG AATCCATGGT TCCAATTTAT TTTAGCTACA 6960

CCTGTACAAT TTATTATTGG ATGGCAATTT TATGTAGGTG CTTATAAAAA CTTAAGAAAT 7020 

GGTGGCGCCA ATATGGATGT ACTTGTTGCT GTTGGTACAA GTGCAGCATA TTTTTACAGT 7080

ATTTATGAAA TGGTTCGTTG GCTAAATGGC TCAACAACGC AACCGCATTT ATACTTTGAA 7140

ACAAGCGCCG TACTAATTAC CTTAATCTTA TTCGGTAAGT ATTTAGAAGC TAGAGCGAAG 7200 
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TTAAAAGATG GTAATGAAGT GATGATTCCT CTAAATGAAG TACATGTTGG AGATACACTT 7320 

ATCGTTAAAC CAGGTGAAAA GATACCTGTT GATGGCAAAA TTATTAAAGG TATGACTGCC 7380

ATCGACGAAT CTATGTTAAC AGGTGAATCT ATCCCTGTTG AGAAGAATGT TGATGATACT 7440 

GTAATTGGTT CAACGATGAA CAAAAACGGT ACTATTACTA TGACAGCAAC AAAAGTTGGC 7500 

GGGGACACTG CGTTGGCAAA TATTATTAAA GTTGTCGAAG AAGCTCAAAG TTCTAAAGCG 7560 

CCGATTCAAC GATTGGCAGA TATTATTTCT GGTTATTTCG TTCCTATCGT TGTTGGTATC 7620 

GCACTATTAA CATTTATCGT GTGGATTACT TTAGTTACAC CAGGTACATT TGAACCTGCA 7680

CTTGTTGCGA GTATTTCCGT TCTCGTCATT GCTTGTCCAT GCGCATTGGG ACTTGCTACA 7740 

CCAACTTCTA TTATGGTAGG TACTGGTCGC GCTGCTGaAA ATGGTATTTT ATTTAAAGGT 7800 

GGCGAGTTTG TTGAACGCAC ACATCAAATT GATACCATCG TTTTAGATAA GACGGGTACC 7860 

ATTACAAATG GTCGTCCAGT CGTGACAGAT TATCATGGTG ACAATCAAAC GCTACAACTA 7920

CTTGCTACTG CTGAAAAAGA TTCTGAACAC CCATTGGCAG AAGCCATTGT CAATTATGCA 7980 

AAAGAAAAGC AATTAATATT AACTGAGACA ACAACATTTA AAGCAGTACC TGGCCATGGT 8040 

ATTGAAGCAA CGATTGATCA TCACCATATA TTGGTTGGTA ACCGTAAATT AATGGCTGAC 8100

AATGATATTA GCTTGCCTAA GCATATTTCT GATGATTTAA CACATTATGA ACGAGATGGT 8160 

AAAACTGCTA TGCTCATTGC TGTTAATTAT TCATTAACTG GTATCATCGC AGTGGCAGAT 8220 

ACTGTCAAAG ATCATGCCAA AGATGCTATA AAACAATTGC ATGATATGGG CATTGAAGTT 8280

GCCATGTTAA CTGGCGATAA TAAAAACACT GCTCAAGCCA TTGCAAAACA AGTAGGCATA 8340 

GATACTGTTA TTGCAGATAT TTTACCAGAA GAAAAAGCTG CACAAATTGC GAAACTACAG 8400

CAACAAGGTA AGAAGGTTGC GATGGTTGGT GACGGTGTAA ATGATGCACC TGCATTAGTT 8460 

AAAGCTGATA TCGGTATCGC CATTGGTACA GGTACAGAAG TTGCCATTGA AGCAGCTGAT 8520 

ATTACTATTC TTGGTGGCGA CTTGATGCTT ATTCCTAAAG CCATTTATGC AAGTAAAGCA 8580 

ACCATTCGTA ATATTCGTCA AAATCTATTT TGGGCATTCG GCTATAATAT TGCCGGTATC 8640

CCTATAGCTG CATTGGGCTT ACTTGCGCCA TGGGTTGCTG GTGCTGCAAT GGCACTAAGT 8700 

TCAGTAAGTG TTGTCACAAA CGCACTTAGA TTGAAAAAGA TGCGATTAGA ACCACGCCGT 8760 

AAAGATGCCT AGATTCCTTA ATAATGAAGG ATTCGTTGGT GATTCTGAGA TAGGCTAGTG 8820 

ATTGGCTCTA TAATGTCGCG GTTTAyaGTC GGATCTTCGC TCCAACTGCA TATATAGTnA 8880

CACTTTTCGC TTGGCGAATT AGTGTATCTT ACCTAATAGc TCCGCCTATT AGGTTCCATC 8940

ATTATTATAA ATAATAAGTA CACTACGGtT TACAGTTGGA TCTTCGCTCC AACTGCATAA 9000
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GAAATTTTAA ATGTTGAAGG TATGAGCTGT GGTCACTGCA AAAGTGCTGT TGAATCTGCA 9120 

TTAAATAATA TTGACGGTGT CACTTCAGCT GACGTTAACC TTGAAAATGG TCAAGTAAGT 9180 

GTTCAATATG ATGACAGTAA AGTTGCTGTA TCTCAAATGA AAGACGCAAT TGAAGATCAA 9240

GGTTACGATG TCGTTTAATT AGGCAATATT CAACGTCATC AACACCAAAT TAAAAAATCG 9300

AACTGATGAG AATCCCAACA ATCCAAATTA TCTCATCAGT TCGATTTTTA ATTTACTCGT 9360

AACCTAGTAT CTCCAGTCTG CAATACATCT AATGTTGCAT CTAATGCATC GACAATTAGA 9420

TTTTTAACTG CAGCTTCAGT ATAAAACGCA ATATGTGGTG TTAATATGAC ATCTTCCCTG 9480

TCAATCAACG ATTCTAACAA TGGATCGTTC AGTGTTTTGC CCCTTTGATC ACTTGGGAAA 9540

AGTTTGCGTT CAAATTCATA CGTATCAAGT GCTGCACCTT TAATCACACC ATTGTCTAAT 9600 

GCGTCTAATA ACGCCTTAGT ATCTACTAAA GAACCTCTCG CACAATTGAC AAATACTGCG 9660 

CCCTTTTTAA AATGTTTAAA TAATTCAGCA TTAAATAGAT AATGATTATA TTTCGTTGCA 9720 

GGTACATGTA ATGTCACGAT ATCAGCACCT TCAACCGCTT CCTCAATCGT ATCTTTGTAA 9780

TCGACATACG TTGCAATTTT AGCATTAGGA AACGGtCGTA TGCGACCACA TCACTTTGAT 9840

AACCATTGGC AAATATATCG GCTACTACAC GGCCAATTCG ACCTGTACCA ATAACAGCTA 9900 

CTTTTAAATC TTTAATGGAT TTCGATAAAA TAGTAGGTTC CCATCTAAAA TCATGcTCCC 9960

GCACTTTCGT TTGAATTTGA TTAAAATGAC GAACCACATT AATAGCCTGG TTCACAGCAA 10020

ACTCCGCAAT TGAATTCGGA GAGTATGACG GCACATTTGA CACAATAAAG TTATACTTGT 10080 

TTGCTAACTC CAAATCATAT GTATCAAATC CAGCACTACG TTGTGCGATT TGTTTAATAC 10140

CTAGTTCATT TAATCGTTTA TAAACATGCT CTGATAATGG TATTTGTTGT GATAGCGATA 10200

AGCCATCATA ACCAGCGACA CCTTCAACAT TGTCATCAGT TAATGCTTCT TTAGTAATAT 10260 

CTACOTCAAC ATGATGTTTC TCTGCCCACG CCTTGATATA AGGCATATCT TCATCACGTA 10320 

CACTCATGAT TTTAATTTTT GTCATTTTAA CATCACCCTT AACTTTATTA TTCATATAAA 10380

TATGCTAGTT CTGTTAATCT TATTGCAGCT TCGTCTAATT TCTGGTCATC TAAOGCCAAT 10440 

GAAATTCTCA CATAACGATT ACCATTCTCT CCAAATGGTT TCCCTGGAGC AACAAGTATT 10500 

GACTTCTCTT GCACTAAAAA TTGCTCAAAT TGCTCGCTGT CATAACCAGG CGGTGTTTCC 10560 

AACCATACAT ATATGCCACC TTTAGCATGA ACAAATGGCA AATCAGCTTT TGCAAGCATG 10620 

GCTTCGAATC GGTCACGACG TGTTTTAAAT ACATTGCTTT GTTCTTCTAA AAAATCATCA 10680

TAATGATTCA AAGCATATAT TGCGGCATCT TGTAATGCAC CAAACATCCC AGCATTTGTG 10740

TGCGTTTGGT ACTTTTTCAA AGCTTGAATC ATATCTTTAT TACCAACTGC AAAACCGACT 10800 
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CCATTTTCCG AAGCAAGTAT ACTAGGATTT TTAGCGTCGA AACCGAAAGC ACCATAAGCA 10920 

AAATCATGCA CGATTTTAGT GTCTGTACCT TTAAATTTAG cTATCGCTTC ATCAAAAACT 10980

TCTTTCGTAG CTGTCGATCC AGTTGGATTA TTTGGATACG TTAAATAAAT GAGTTTTGTT 11040

TTATCTATTA TTTGTGAATC AACTTTGGAC CAATCTGGCA AATAATGTGG CGGTTCTAAA 11100 

TTAAGCGGGA CTGGCTTGCC ATCAGCTAAA AGTACACCTG CTAAATAATC CGTGTAGCCT 11160

GGATCAGGTA GTAATACATA GTCTCCTGGA TTGATAACAC ATGTTGGTAC TGCCACTAAT 11220 

CCATTTTTTG TACCATATAA AATGCATACT TCATCTTCTT TATCTAACGT CACATTATAT 11280 

TGTCTTTGAT AAAAATCTAC AATAGCTTGC TTGAACGCTT CTTTACCATG AAAAGCACCA 11340

TATTTTTGAT TTTCAGGAAT AGTTAGTGCT TTTTGAAAAT GATCAATAAT ACCTTGTGGC 11400 

GTGGGCCCAT CAGGGATTCC AACTGCCATA TTAATTAATG GCAATGGTCC ATGTTCGATT 11460

TTACGTCCCA TCGTTTTCCC GAAATAACTA TCAGGGATAT TTGCTAATTT GTTAGAGATC 11520

ATCAAATTCC TCCTCTATCA TTAAACATAG CCTGGGCGAC TATCATAATC CTAACAACTT 11580 

GTATCACTCT CATTTAGATG GTTACAATGA CATCGCCATT CACCGTTATG TTCAACAGAA 11640 

CTTATGACAC ACGTTGTATT GAATGAATTT ATTTTCATTT TAGGTAGGTA TAATATTATT 11700 

GTCAATATTA GGAATTTTCA GATTAATATG CACTCAATCG TTATGATTTA ACTGTCATGC 11760 

ATATCCGCAT GCGCAAGCAG TTAGATATGC TTATATAAAG TATAACGCCC ATCAAGGTAC 11820 

GTATTCAAAC GTGAACCTTA ACAGGCGTCA TTCATTGTTA AATAAAACTT CTTAAGCACA 11880 

TACTTATTTC ACTATGCCTT TTACGTTCCC CTTATACTTT TCTCACATCT TTCTCTTAGA 11940 

CTACTCCCTT ATACGCCCCG CTCAATATCT TTAATCATTT CATCTACAGT TATTTTCGCA 12000
CTCGTTAAGA CAATAGGAAC GCCTGCACCT GGATGCGTAC TTGCACCTGC AAAATATAAA 12060 
TCTTTATAAT CTCGCGATAC ATTTTGTGGA CGATAATAAT TACTTTGCGC TAAAGTTGGC 12120 
ATTAAACCGA ATGCCGAACC AAATTTCGCA TGATACGTTT GCTCAAAATC ATTTGGCGTA 12180
AAGATTGTTT CTGAAACAAT ATGCGATTTT ATATCTTCAA ATACTTCAAT CGTTGCTAAT 12240 
TTACGATAAA TAATTTCCTT TATTTGTTGC GTCAAAGCTT CATCTGACCA ATCGATTCCG 12300 
CTACCTGTTT TAAGTTCCGG CGTCGGCATT AGCACATAAA TACCAGTTTT GCCTTCTGGC 12360
GCAAGTGATT TATCAGCGAC CGCTGGTACA TACACATAAA TAGAAGGATC ATATGATAAA 12420

CGTCCCTCAA ATATTTCTTC AATATTGCCT CTAAAGTCAT CTGAAAAAAT AACATTATGA 12480 
AGTCTCACTT GATCTGTCAC ATCAATATCT ATACCGATAT ACATTAAAAA TGCTGAACAA 12540
GAGTAATCTA AGTCTGCAAT TTTATGTGGT GGATACTTTT TAATAGGTGC AAAATCTGGC 12600 
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ATGTCACCAT TCACTTTTAT CGCATCGGCC CGTTTGAATT TAGGATCAAT AATAATTTGC 12 720 

TCAATTTCAG CATTTAGTTC AATATTAACG CCTAAGTCTT TATTTAATTG CGCTAGcCCT 12780 

TGAGCCATGC CATACATACC GCCTTTAATA AAATGCACAC CAAACATCAT TTCAATCATA 12840 

GGAATAATTG AATATAGTGA CGGGCCTCGT TTTGGATCAA TTCCTATGTA TAACGTTTGA 12900

AACGCTAAAA GCTTTTGTAT CTTTTCGTTA TCAATATAAT GTTCAATTAG CTGATCTGCA 12960

TGATTTAACG TTTTTAACTT AGCACCTTGC ACAAGTGACG TCATATTATA AAAGTCACTC 13020 

GGTTTGCGAT ACGTTCTTTC TAAGAAATAG CGACGTGCAA TTTCATATTT TTTATAAACA 13080 

TCCGTTAAAA AGGACATAAA ACCATGCGTT GAACCAGGTT CTATACTTTC TAGCATTTGC 13140

TGTAATTCAG CTAAATCTGT AGGCACCGTT ATACGATCAT CGTGGTCAAA ATACACATCG 13200 

TAAATATAAC GTAATTGTCT CAATTCAATA TAATCTTCAT AATTTTTACC ACACGCTGTA 13260

AAAACATCTT TATAAACATC TGGCATCATG ACAATTGTGG GACCCATATC AAATGTAAAG 13320 

CCGTCTTTCT TTAATTGATT CATACGCCCG CCTACATTAT TATTTTTTTC AAATATCGTC 13380 

ACTTCATGAC CTTGAGAAGC AATACGGGCT GCCGCTGCTA ATCCTGTGAC ACCTGCACCA 13440

ATTACTGCAA TCTTCATTAT TCAACCACCT ATATTCTATG ATATTTACTA TTTATTTCAT 13500

GAAACAACTT TGCCTTTTTC CTCTTATCCA CAAAAACACG TTCATGTAAT GTATAGTTAG 13560

CCTGTCTCAC TTCGTCCAGT ATTTCAATAT ATATACGTGC TGCTAATTCT ATGATTGGTT 13620

GTGCTTCAAT ACTAAATACT TTGATTTGAT CCATAACATC TTGAAAATCT TTTTCTGCGA 13680 

TAGCTGCATA ATATTCCCAT AAGTCAATAT AATGATTATT AACACCATTT TGGTACACTT 13740

CAGCAATATC AACTTCATAT TGCTTTAATC GTTGCTTACT AAAATATATC CGTTCATTGT 13800

CAAAATCTTC ACCGACATCT CTTAATATAT TAAnGGGATC CTCTAGAGTC GACCTG 13856 
(2) INFORMATION FOR SEQ ID NO: 32: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10088 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
ATATATAAAT ATAGATTAAG TATATAGATT AATCAACTTT TTTGGAAGAG CAAATCACGC 60
AATCAACAAA TAATATAAGA AGTTTTTGCG ATAGTTTTAA AATAGCTGTA ATAGAATACT 120
AAATGTGACA AACTTAGAAC TAATATCAAG TGTTGATGTT TTGAATATAA AAATGCTAAT 180 
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ATAATTGGTT AATATATGAG TAATTAGAAA ATAGACAAAG GATGACGATT TATGTATATC 3 00 

AATATGAAAG ATTATGGGTT AACAGGCATA AACAAAACTA AAGATACTCG AGCAATACAA 360 

CGTGCGTTAA ATCGTGGAAG ATGTAAACCA ACGACAGTTT ATATACCGAA AGGGACGTAT 420 

GATATTTGCA AACCATTAAC GATATATGGC AATACAACAC TTTTGTTAGA TAATGAAACT 480

ATTTTACGCC GATGTCATTC TGGTCCTTTA TTAAAAAATG GTCGTCGCTT TGGTTTTTaT 540

CGTGGTTATA ATGGACACAG TCATATTCAT ATTAAAGGCG GCAAGTTTGA TATGAATGGT 600

GTATCGTATC CTTATAACAA TACAGCTATG TGCATTGGGC ATGCTGAAGA TATTCAATTA 660

ATAGGTGTGA CCATTAAGAA TGTAGTGAGT GGTCATGCAA TTGATGCTTG TGGGATTAAC 720 

GGACTCTATA TTAAAAGCTG TTCATTTGAA GGATTCATAG ACTATAGTGG CGAACcTTTT 780 
ATTCTGAAGC AATACAATTA GACATTCAAG TACCTGGTGC TTTTCCAAAA TTCGGAACgA 840



TGCCCGAAAT GGGAAGTTGG AATCGTGCTA TTGGCTCACA TGCAAGTAGA CATAATCGAT 960

ACTATGAGAA TATTCATATT AGAAATAATA TATTTGAAGA TATACAAGGT TATGCATTAA 1020 

CTCCCTTGaA GTATAAAGAT GCTTTCATTA TTAATAATAA GTTTATTAAC TGTGaGGGTG 1080 

GCATTAGATA TTTAGGAGTT AGAGATGGTA AAAATGCAGC AGATGTGaTG ACAGGaAAAG 1140 

ACTTAGGTTC CCAAGCAGGC ATAAATATGA ATATAATTGG AAATGAATTT AAAGGATCAA 1200

TGTCTAAAGA TGCGATACAT GTACGTAATT ATAATAATGT TAAACATAAA GATGTATTAA 1260

TCGTTGGGAA TACATTCAAT AATTCGACTC AATCAATTCA TTTAGAAGAT ATTGATACAG 1320 

TGTTTTTAAG TCCTGTTGAA GCGGGTATTC AAGTTACTAC AATCAATGTA GATGAAATAA 1380

AAAAGTAAAA AGTTTCGCAT GACATTAGGA TTAAGAATAG TAGATAATTT TTGAAAGCGC 1440 

ATTGATAAAA CGGTATAAAT ATGCTATAAT AAACCCAATT ATCTGATAAA AGGGGTATTT 1500 

TGACGGTAAT GATAATACAA GATAGACAAC TTTCTATACT CTAATATAGT GAGTTGAAGT 1560

AGCTTGTCAT AATCATCATG AGGGGGAAAT TTATGGCTTA TTTCAATCAA CATCAATCAA 1620

TGATATCGAA AAGGTATTTA ACATTCTTTT CAAAATCAAA GAAAAAGAAA CCGTTTAGTG 1680

CGGGACAACT TATTGGACTA ATATTAGGTC CATTACTTTT CCTATTAACA TTATTATTCT 1740

TTCATCCACA AGACTTACCT TGGAAAGGCG TCTATGTTTT AGCGATTACT TTATGGATTG 1800 

CGACTTGGTG GATTACTGAA GCAATTCCTA TTGCAGCAAC GAGCTTATTA CCAATTGTGT 1860 

TATTACCATT AGGTCATATA CTTACACCAG AACAAGTATC ATCCGAATAT GGCAATGATA 1920 

TTATCTTTTT GTTTTTAGGT GGATTTATTT TGGCAATTGC AATGGAAAGA TGGAATTTAC 1980 
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TTGGATTCAT 


GGTGGCAACA GGATTCTTAT CTATGTTTGT ATCGAACACT GCAGCTGTAA 2100

TGATTATGAT TCCGATTGGT TTAGCAATTA TTAAGGAAGC ACATGATTTA CAAGAAGCGA 2160

ATACGAATCA AACAAGTATT CAAAAGTTTG AAAAATCTCT AGTTTTAGCA ATTGGCTATG 2220

CAGGTACGAT TGGTGGCTTG GGTACATTAA TCGGAACCCC GCCATTAATT ATTTTAAAAG 2280

GACAATACAT GCAACATTTT GGACATGAAA TTAGTTTTGC TAAATGGATG ATTGTAGGGA 2340

TTCCAACGGT CATTGTTTTG TTAGGTATTA CTTGGCTCTA TTTAAGATAT GTTGCGTTTA 2400

GACATGATTT GAAATATTTa CCTGGTGGTC AGACGTTAAT TAAACAAAAG TTAGACGAGC 2460

TTGGCAAAAT GAAGTATGAA GAAAAGGTAG TACAAACTAT CTTTGTACTT GCTAGCTTAT 2520

TATGGATTAC AAGAGAGTTT CTTCTGAAAA AATGGGAAGT TACGTCATCT GTTGCAGATG 2580

GTACGATTGC TATTUTTATA TCAATATTAT TATTTATTAT TCCAGCTAAA AATACTGAAA 2640

AACATCGCCG TATCATTGAC TGGGAAGTTG CAAAAGAGCT CCCTTGGGGT GTATTAATTT 2700

TATTTGGTGG CGGTTTAGCA TTAGCGAAAG GTATTTCTGA AAGTGGTTTA GCAAAATGGT 2760

TAGGCGAACA GTTGAAATCA TTAAATGGTG TTAGTCCGAT TCTTATTGTA ATTGTCATAA 2820

CAATCTTTGT CTTATTTTTA ACTGAAGTGA CATCTAATAC TGCAACTGCA ACGATGATTT 2880

TACCGATTTT AGCAACGTTG TCTGTTGCTG TTGGAGTGCA TCCATTACTA CTTATGGCAC 2940

CTGCAGCTAT GGCGGCTAAC TGTGCATACA TGTTACCAGT AGGGACACCA CCGAATGCAA 3000

TTATCTTTGG TTCTGGTAAA ATATCTATCA AACAAATGGC ATCAGTAGGA TTCTGGGTAA 3060

ACTTAATCAG TGCAATAATT ATTATTTTAG TCGTGTATTA TGTAATGCCT ATAGTTTTAG 3120

GTATTGATAT AAATCAACCA CTGCCATTGA AATAGTAATT GCAGATTAGA ACGAAAAATA 3180

AAAGGTTACA TTAGCAATTG CTTGGACGAG TGGTAACGAA ACGTATACCG CAGCATCGTG 3240

TAASAACAAT ACAAACAAAA GAAAGTCAAC CAAGGATGGA TTCCTATTTT AATCCTTGGT 3300

TGACTCTTTA TTTTATTTAA ATTGTAGAAC CTAGAAAATA AAGTTTAATT AAAAGCACCA 3360

ATCATTTCTA CTTTGAAATC TAAGGTTTCT AAAATAGCAA TGACTTTCTT TATATCGGTT 3420

GTAATTGCAG AATCAGCCTG AACGAAAAAT CGATACATAC CTAATTGTGT TTTTAAAGGA 3480

CGAGACTCAA TCCAGGATAA ATTAATATTA AACAAAGCAA ATGTATTAAG CACACTTGCT 3540

AACAACCCAG GTTTATCATG CATTGGTGTA ATTAAAAACA TCAATGATGT CGCATTTTGA 3600

TCAAATTGCT GCTGATTTTT TATAACTAAA AAACGTGTCA CGTTATGTGG ATAGTCTTCA 3660

ATATGTGTAT CAATAGGTGT AAAACCATAA GctTCGCCAC TACCTAAAGG TGCAATTGCT 3720

GCAACGCCAT TTTCAATTTT AGTCAAACTT TGAATTGTAC TGTCGACATA ATCATAGTCA 3780
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TTTTTAATAT CAGAAATGGA ATCTGTTCCA TT AC CAT ATA ATGCAAAGTT AATATCTAAA 3 900 

CGTATTTCAC CGTGTGCAAA GACATCTTGC TGTGCAAGTG CATCTGCCAC AATGTTGATT 3960

GTTCCTTCTA TAGAATTTTC AATAGGGACA ACACCAATCG ATGTGTCATC ATCTGCAACT 4020

GCCTTGATGA CTTCAAATAA ATTTGACTTT GGTTGAAAAG TTGCTTCATT TTCAGAAAAA 4080

TACTGACGAC AAGCCAAATA TGAAAATGTA CCTTTAGGGC CTAAATAATA TAATTGCATA 4140

TGCTACACCT CTACTAACTT AATGATGGAA AGGGCACTGG TTAGCATTTG ATTCTTTCTT 4200

TTTATAGAAA AAGTTTGGAT CTTTTACTGT ATTGTCATAT CCGTGATGAT AATTTGACGT 4260

CAATGTTGGA GATAATGGCG GTGCTAGCCA AGACCATTTT CCGGTAACTT GACGACCTTG 4320

TTGTGCTTCG TTACGTTCGA ATAGTTCGAA TTGCTTTGCA GCGGTCAAAT GATCGACAAT 4380

TGATACGCCT TCTTTTTTAA AGGAATGATA CACAGCATAG TTCAATTCAA CAAGTGCTCG 4440

ATCTTTATTA AATGAATTAT TTTTAAGTGT ATCAAATTCA AACGCATCTG CAACTTTTTC 4500

TAGTAAATTG TAACGGTAAT CATCAATAAA GTTACGTACG CCAATTTCAG TTACCATATA 4560

CCAACCGTTA AAGGGTGCAG TTGGATATAC AATGCCACCG ATTTTTAAGT CCATATTGGA 4620

AATGATAGGG ACTGCATACC ATTTTAAGTT CAATTTTCTT AATTTTGGAT AATGATTATG 4680

TTCAATAGGT ACTTCTTTAA TTAATGAAGT AGGATATTCG TAAAATTTAA CTGACTCATT 4740

AGGTAATTGG TAAATCAGTG GTAACACGTC AAAATTAGTA CCTTTTCCTT TCCAACCTAA 4800

GTGATTTGCT AAGCGTGTAA CTTCTTTTTC AGCAGGATCA CCACAATTGT CATAGCCAGC 4860

ATAGCGAATT AATTGATTGT TGAAAATTTT AGGTCCATCC TTTGGAGCAT ATATAGTAAT 4920

ATACGGCTTT AATTTACCTT CATTTGTAGC CTGTGTAATA TGATAAGTAA TTGATGATAA 4980

GAACGATGCT TCGTCAGTAA CATCTCTTGC ATCAATGACA TTTAACGAAT CCCAAAATAA 5040 

ACGACCAATG CAACGATTTG AATTACGCCA AGCCATTTTA GCACCATAAA TAAGTTCTTC 5100 

TTCTGTATGT GTATATGTCC CAGTTTCTTT TATTTCTAGT TCAATGTCAT GTAAACGTTT 5160 

ATTGATAATT TGCGTTTCAT AATGACACTC TTTATACATG TTTTCTATGA AAGCTTGAGC 5220 

CTCTTTAAAT AACATTAACA ACACCTCGCT TTATATTATA GTCTACATTA TTAAAATACT 5280 

CTTAAAAATT ATGTATATGT CATTAAATTG TTGGTTGATT TTAATTAAAA GTATGGAAAT 5340

TAAGGGGCTC TTATGTATAT AAAAAAATGA ATTATGATAA AATGTAAGAA AATATTTAGG 5400

TCGATTGGAG AGATACAAGT GTACCAATTA GAAGACGACA GTTTAATGTT ACATAATGAC 5460

TTATATCAAA TAAATATGGC TGAAAGTTAT TGGAATGATA ATATTCATGA AAAAATGGCT 5520

GTATTTGATT TGTATTTTAG AAAAATGCCA TTTAATAGTG GCTATGCTGT TTTTAATGGT 5580
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TTAAAGTCTA 


TTGGCTACAA GGATGATTTC TTATCATATT TAAAAGATTT AAAATTCACA 5700

GGCAGCATCC GTTCGATGCA AGAAGGCGAA TTATGCTTTG GTAACGAACC ATTGTTACGC 5760

GTAGAAGCAC CATTGATTCA AGCGCAATTA ATAGAAACAA TTTTATTAAA CATTGTAAAT 5820

TTCCATACAT TAATTACAAC AAAGGCTAGC AGAATTCGTC AAATTGCATC AAATGATAAA 5880

TTAATGGAGT TTGGTACACG TCGTGCGCAA GAAATTGATG CAGCATTGTG GGGCGCTAGA 5940

GCTGCTTACA TCGGGGGCTT TGATTCTACA AGTAATGTTA GGGCGGGGAA ATTATTTGGT 6000

ATACCTGTGT CTGGTACACA TGCACATGCA TTTGTCCAAA CTTATGGAGA CGAATATGTT 6060

GCCTTCAAAA AATATGCTGA AAGACATAAA AATTGTGTGT TCCTAGTAGA TACATTCCAT 6120

ACTTTAAAAT CTGGCGTGCC AAATGCAATA AAAGTTGCAA AAGAATTAGG TGACAAAATT 6180

AACTTTGTAG GTATTCGATT AGATTCTGGA GATATCGCTT ATTTATCTAA AGAGGCAAGA 6240

CGTATGCTTG ATGAAGCAGG ATTTACTGAA ACTAAAATTA TCGCGTCTAA TGATTTGGAT 6300

GAAGAAACGA TTACGAGTTT GAAAGCACAA GGTGCAAAAG TAGATTCTTG GGGCGTTGGT 6360

ACAAAGCTGA TTACAGGATA CGATCAACCA GCATTAGGTG CAGTATATAA ACTTGTAGCT 6420

ATTGAAAATG AAGATGGTTC ATATAGTGAT CGTATTAAAT TATCAAATAA CGCTGAAAAG 6480

GTTACGACGC CAGGTAAGAA AAATGTATAT CGCATTATAA ACAAGAAAAC AGGTAAGGGA 6540

GAAGGCGATT ATATTACTTT GGAAAATGAA AATCCATACG ATGAACAACC TTTAAAATTA 6600

TTCCATCCAG TGCATACTTA TAAAATGAAA TTTATAAAAT CTTTCGAAGC CATTGATTTG 6660

CATCATAATA TTTATGAAAA TGGTAAATTA GTATATCAAA TGCCAACAGA AGATGAATCA 6720

CGTGAATATT TAGCACTAGG ATTACAATCT ATTTGGGATG AAAATAAGCG TTTCCTGAAT 6780

CCACAAGAAT ATCCAGTCGA TTTAAGCAAG GCATGTTGGG ATAATAAACA TAAACGTATT 6840

TTTGAAGTTG CGGAACACGT TAAGGAGATG GAAGAAGATA ATGAGTAAAT TACAAGACGT 6900

TATTGTACAA GAAATGAAAG TGAAAAAGCG TATCGATAGT GCTGAAGAAA TTATGGAATT 6960

AAAGCAATTT ATAAAAAATT ATGTACAATC ACATTCATTT ATAAAATCTT TAGTGTTAGG 7020

TATTTCAGGA GGACAGGATT CTACATTAGT TGGAAAACTA GTACAAATGT CTGTTAACGA 7080

ATTACGTGAA GAAGGCATTG ATTGTACGTT TATTGCAGTT AAATTACCTT ATGGAGTTCA 7140

AAAAGATGCT GATGAAGTTG AGCAAGCTTT GCGATTCATT GAACCAGATG AAATAGTAAC 7200

AGTCAATATT AAGCCTGCAG TTGATCAAAG TGTGCAATCA TTAAAAGAAG CCGGTATTGT 7260

TCTTACAGAT TTCCAAAAAG GAAATGAAAA AGCGCGTGAA CGTATGAAAG TACAATTTTC 7320

AATTGCTTCA AACCGACAAG GTATTGTAGT AGGAACAGAT CATTCAGCTG AAAATATAAC 7380
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TAAACGACAA GGTCGTCAAT TATTAGCGTA TCTTGGTGCG CCAAAGGAAT TATATGAAAA 7500 

AACGCCAACT GCTGATTTAG AAGATGATAA ACCACAGCTT CCAGATGAAG ATGCATTAGG 7560 

TGTAACTTAT GAGGCGATTG ATAATTATTT AGAAGGTAAG CCAGTTACGC CAGAAGAACA 7620 

AAAAGTAATT GAAAATCATT ATATACGAAA TGCACACAAA CGTGAACTTG CATATACAAG 7680

ATACACGTGG CCAAAATCCT AATTTAATTT TTTCTTCTAA CGTGTGACTT AAATTAAATA 7740

TGAGTTAGAA TTAATAACAT TAAACCACAT TCAGCTAGAC TACTTCAGTG TATAAATTGA 7800

AAGTGTATGA ACTAAAGTAA GTATGTTCAT TTGAGAATAA ATTTTTATTT ATGACAAATT 7860 

CGCTATTTAT TTATGAGAGT TTTCGTACTA TATTATATTA ATATGCATTC ATTAAGGTTA 7920 

GGTTGAAGCA GTTTGGTATT TAAAGTGTAA TTGAAAGAGA GTGGGGCGCC TTATGTCATT 7980 

CGTAACAGAA AATCCATGGT TAATGGTACT AACTATATTT ATCATTAACG TTTGTTATGT 8040 

AACGTTTTTA ACGATGCGAA CAATTTTAAC GTTGAAAGGT TATCGTTATA TTGCTGCATC 8100 

AGTTAGTTTT TTAGAAGTAT TAGTTTATAT CGTTGGTTTA GGTTTGGTTA TGTCTAATTT 8160 

AGACCATATT CAAAATATTA TTGCCTACGC ATTTGGTTTT TCAATAGGTA TCATTGTTGG 8220 

TATGAAAATA GAAGAAAAAC TGGCATTAGG TTATACAGTT GTAAATGTAA CTTCAGCAGA 8280

ATATGAGTTA GATTTACCGA ATGAACTTCG AAATTTAGGA TATGGCGTTA CGCACTATGC 8340

TGCGTTTGGT AGAGATGGTA GTCGTATGGT GATGCAAATT TTAACACCAA GAAAATATGA 8400 

ACGTAAATTG ATGGATACGA TAAAAAATTT AGATCCGAAA GCATTTATCA TTGCGTATGA 8460 

ACCTCGAAAC ATACATGGTG GATTCTGGAC TAAAGGCATT CGTCGTAGAA AGCTTAAAGA 8520

TTATGAACCA GAAGAACTGG AAaGTGTAGT AGAaCATGAA aTTCmAAGTA AaTGAGAaTG 8580 

AAmCAATtGC TGATTGTTTG TCACGAATGA AAtGCAAGGG TATATGCCGG TAAAACGTAT 8640 

TGAAAAACCC GTGTTTCAAG AGCAAAAAGA TGGCACGGTT GAAGTATCAC ATCAAGAAAT 8700 

CGTTTTTGTA GGTAAGAAAA TCCAATAACA TAATCCAATT TAAATAAAGA CTATTTGAAG 8760 

AGGAAAGGCT ATTCAAAGTT TGAGTAATTT TACTTTGAAT AGCCTATTTG TTTATACATG 8820 

CAAGATGCTC GATCCATATT GTATGAGAAA CCCCCAGCAA GCTATATAAA GCATATGCTG 8880 

GGGGTTCTTA ATATTTTAAA AATTATTGTT AGATTATATA TATCGTCGCT TTTTCTAAAA 8940 

CAATCTCATC GCATGAAATT TTTTCTTCCT AGAGACCTTT AATAAGATTA ATAGTTTACT 9000 

TAATCATATC TAGATAGTCT TATGACTTAT GCTTAATGAA AGTCATTCTA GGAGAAGTTC 9060 

CCAAAGCTTC TGTGTTCATA ATTGTTAGTA GTATTTTATT ATCATTTGGT ATAAATATTT 9120 

CAATAACAAT TGAGCTATTA TTTTTATTAT ATAATGTGAG TTGTTTGTGT TCTGTATTTA 9180 
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CATTTAAATC TTGAGGATGC CATTCTCCCT CAATAATATT AAGATAATAC TTAGCCTCTG 9300 

AATTACATTT GAATTTATCA ATACTAAATA ATTCAATTTG TTCCATAATA TTATTTACCT 9360 

TTCTAAAATA CAAATTTTAA TAACCATAAA TAGATGAATA CCATCGATAA TGGTCGCCAT 9420 

TGGATACTGG AATAACATTG TTTTTAGCAT CTTGAGTCAT AAAACCATTA TCCCATGGAT 9480 

TCCATATAAT TATAACCTCT TGTCCATTAT CTAATTTAGC GTTCCCAACA ACTGCCATGG 9540

CATGCCCTGC GTGCATACCA TTTCTTGATT CTACTCTACT ACCTAAAACA GCAATTCCTT 9600

TATTATTTTT AGTAAGATTG TCAACTTCAT TATATGTAGT CATTCTATTA AGAAGTTGTG 9660 

GACTTCTTCC CTGAGTTTGT CCAAAATAAA TCATCTCTCT TGGCGTTAAA CCAGTAAATT 9720 

GGAATCGTTG TCCTTGTAAG TTTGGGTGTA AAAATCTCAT CACAGCTTCT GCATGATATT 9780 

TGTTAGTATT ATAAGTCGCA TTTAGTAATT CAGACATCGT ATAGCCTGCA CACCAACCAT 9840

TGTTACCTTG AGTTTCTCTT ATCTTGAAAT TCTCAAGTTT ATTTATATAT TGsTCGTTGT 9900

AAGTATAATT ATTACTTTTA AATTGACTAG TTGGCATAGT GACAGAAGCT TTTTGCTTTA 9960

GTTGCGTTAC ATTATTGCCA GTAGGTATAC TCTCAGTCTT TnTnAACTnT nTATCTTCTA 10020

GACGTGGTGT TTTTAGTACT AGTTTAGCTT TATGATTTTG AGTACCACAT AGTAACCTTT 10080

TGAGTTGT 10088 
(2) INFORMATION FOR SEQ ID NO: 33:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7563 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33:

CGGAAACGnA CCCnATGCGT ATGCTTGACG TGCCAAAATT AAATACGAAG TTCATAGCTT 60

TGAGGTACCA GAAGAACATT TATCTGGTCA AGAAGTCGCA GnACTCATAC AAGCAAATGT 120 

TAAAACAGTA TTTAAAACGC TTGTTCTAGA AAATACAAAA CATGAACATT TTGTATTTGT 180 

TATCCCAGTA AGTGAAACTT TAGATATGAA AAAGGCAGCT GCTTTGGTTG GAGAGAAGAA 240

ATTGCAGCTT ATGCCTTTAG ATAATTTGAA AAATGTAACG GGATACATTC GTGGTGGGTG 300

TTCGCCTGTT GGTATGAAAA CATTGTTTCC AACAGTCGTT GACAAATCGT GTQAAAATTA 360

TAGTCATATC AGTGTGAGTG GTGGGCTTCG AACAATGCAA ATCACAATAG CTGTTGAGGA 420

TTTGATTACA ATAACTAAAG GCAAAATTGG AGCAGTTATC CATGAATGAT TAATAACAAC 480
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TGCCACACTC 


CTTTTTGATT GAATTAGCAT TTTACGATCA TAAACAGTCA TTATAATTGA 600

GTATTTGAAC ATAAAAATGT AATTTTATCG TAACAATTTG AGTGTTTGTG ATTGTTTTTG 660

GTAATTTATG ATTGAAAAGT GAAAGCGTAC TCATTATAAT ACAAAGTGAG ATGGGGTGAT 720

GATGATAATT ACTGaAAAAA GACACGAGTT AATATTAGAA GAACTTTCGC ACAAAGATTT 780

TTTGACTTTA CAAGAATTAA TAGATCGAAC TGGTTGCAGT GCTTCAACAA TACGArGAGA 840

TTTATCTAAA CTACAACAAT TAGGGAAATT GCAACGTGTG CATGGTGGTG CAATGTTAAA 900

AGAAAATCGT ATGGTTGAGG CGAATTTAAC TGAAAAATTA GCAACGAATC TTGATGAAAA 960

GAAAATGATT GCTAAAATAG CAGCTAATCA AATCAACGAT AATGAATGCT TATTTATCGA 1020

TGCTGGTTCA TCTACATTGG AGCTAATTAA ATATATTCAA GCGAAAGATA TCATTGTGGT 1080

AACCAATGGT TTAACACATG TAGAAGCTTT ACTTAAAAAA GGTATTAAAA CAATTATGCT 1140

AGGTGGTCAA GTTAAAGAAA ATACACTTGC TACGATTGGT TCTAGTGCTA TGGAGATATT 1200

AAGACGATAT TGTTTCGATA aagcttttat CGGGATGAAT GGATTAGATA TTGAACTTGG 1260

ATTAACTACT CCCGATGAGC AAGAGGCATT AGTTAAACAA ACAGCAATGT CATTAGCCAA 1320

TCAATCATTT GTACTTATAG ATCATTCTAA GTTTAATAAA GTATATTTTG CTCGTGTACC 1380

TTTGCTAGAA AGTACGACAA TCATCACATC TGAAAAAGCA TTAAATCAAG AATCGTTAAA 1440

AGAATACCAA CAAAAGTATC ACTTTATAGG AGGGACTTTA TGATTTATAC AGTGACTTTC 1500

AATCCTTCAA TTGACTATGT CATTTTTACG AATGATTTTA AAATTGATGG TTTGAACAGA 1560

GCAACAGCAA CATATAAATT CGCTGGGGGG AAAGGTATTA ATGTCTCGCG CGTCTTAAAG 1620

ACATTGGATG TTGAGTCAAC TGCCTTGGGA TTTGCAGGTG GATTTCCTGG GAAATTCATT 1680

ATAGATACAT TAAATAACAG TGCAATTCAA TCGAATTTTA TTGAAGTTGA TGAAGATACA 1740

CGTATTAATG TGAAATTAAA AACAGGACAA GAAACAGAAA TCAATGCACC GGGTCCTCAT 1800

ATAACGTCAA CACAATTTGA ACAACTGTTA CAACAAATTA AAAATACAAC AAGCGAAGAT 1860

ATAGTTATTG TTGCTGGAAG TGTACCAAGT AGTATTCCAA GCGATGCGTA TGCGCAAATT 1920

GCACAAATTA CAGCACAGAC AGGTGCTAAA TTAGTAGTCG ACGCTGAAAA AGAATTGGCT 1980

GAAAgCGTTT TACCATATCA TCCACTATTT ATTAAACCTA ATAAAGATGA ATTAGAAGTG 2040

ATGTTTAATA CAACAGTGAA CTCAGACACA GATGTTATTA AATATGGTCG TTTGTTAGTT 2100

GATAAAGGTG CGCAATCTGT TATTGTCTCG CTTGGCGGTG ATGGTGCTAT TTATATTGAT 2160

AAAGAAATCA GTATTAAAGC AGTTAATCCA CAAGGGAAAG TGGTTAATAC AGTTGGCTCT 2220

GGTGATAGTA CAGTTGCAGG CATGGTGGCT GGAATTGCTT CAGGTTTAAC GATTGAAAAA 2280
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CGGGACGCTA TAGAAAAAAT AAAATCACAA GTTACGATTA GCGTACTTGA TGGGGAGTGA ~ 24 00 

AAATAATGAG AGTAACAGAG TTATTAACAA AAGATACAAT AGCAATGGAT TTAATGGCAA 2460

ATGACAAAAA TGGTGTTATT GATGAGTTAG TAAATCAATT AGACAAAGCA GGTAAATTAA 2520 

GTGATGTCGC GTCATTTAAG GAAGCGATTC ACAATCGAGA ATCACAAAGT ACAACTGGTA 2580 

TCGGCGAAGG TATTGCCATT CCACATGCCA AAGTGGCCGC AGTTAAGTCA CCAGCTATTG 2640

CGTTTGGTAA ATCTAAAGCA GGCGTAGATT ATCAAAGTTT GGATATGCAA CCAGCACACT 2700 

TATTCTTTAT GATTGcAGcG CCAGAAGGTG GCGCCCAAAC ACATCTAGAT GCTTTAGCTA 2760 

AGTTGTCTGG TATTTTAATG GATGAAAATG TACGTGAGAA ATTATTACAT GCTTCATCAC 2820 

CTGAAGAAGT ACTAGCGATC ATAGATGAGG CTGATGATGA AGTGACAAAA GAAGAAGAGG 2880 

CAGAAGCTGA AGCACAACAA GTTGCAACTG CAGAACAATC ATCTAAACAA TCTAATGAGC 2940 

CATATGTGTT AGCAGTAACT GCTTGTCCAA CAGGTATTGC ACACACATAT ATGGCACGTG 3000

ATGCATTGAA AAAGCAAGCG GATAAAATGG GTATTAAAAT TAAAGTAGAA ACGAATGGTT 3060 

CAAGCGGCAT TAAAAACCAT TTAACTGAAC AAGATATTGA AAATGCAACA GGTATCATTG 3120 

TTGCTGCTGA TGTTCATGTT GAGACGGATC GCTTCGATGG TAAAAATGTC GTAGAAGTAC 3180 

CAGTAGCAGA TGGTATTAAA CGCCCAGAAG AATTAATTAA TAAAGCATTA GATACAAGTC 3240

GTAAACCTTT TGTTGCCCGT GATGGTCAAA GAAAAGGTAA CTCAAATGAC AGTCAAGAAA 3300

AATTAAGCCC AGGTAAAGCA TTCTATAAAC ACTTAATGAA CGGTGTTTCT AACATGTTGC 3360

CACTTGTAAT ATCTGGTGGT ATTTTAATGG CAATTGTATT TTTATTTGGA GCAAATTCAT 3420

TTAATCCAAA AAGCTCAGAG TACAATGCGT TTGCAGAGCA GCTTTGGAAC ATTGGTAGTA 3480 

AAAGTGCATT CGCGTTAATC ATTCCAATTT TATCTGGATT CATTGCACGT AGTATTGCGG 3540 

ATAAACCTGG TTTCGCTTCA GGTCTTGTAG GTGGTATGTT AGCAATTTCA GGTGGTTCAG 3600 

GATTTATTGG TGGTATTATT GCAGGTTTCT TAGCAGGTTA CTTAACACAA GGTGTTAAAG 3660

CCATGACACG TAAGTTACCA CAAGCATTAG AGGGATTAAA GCCAACATTA ATTTATCCAC 3720 

TATTAACAGT GACGGCTACA GGCTTATTGA TGATTTATGC CTTTAATCCA CCAGCATCTT 3780 

GGTTAAATCA TTTGTTATTA GATGGATTAA ACAATTTATC AGGTTCTAAT ATTGTATTAT 3840 

TAGGTTTAGT TATTGGCGCT ATGATGGCGA TTGATATGGG CGGTCCATTC AACAAAGCGG 3900 

CATATGTTTT TGCAACAGGT GCGTTGATTG AAGGTAATGC AGCACCAATT ACAGCTGCAA 3960 

TGATTGGTGG TATGATTCCA CCGTTAGCAA TTGCGACAGC GATGTTAATT TTTAGACGTA 4020

AATTTACAAA AGAACAACGT GGTTCAATTA TCCCTAACTA TGTGATGGGT ATGTCATTTA 4080
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TGATTGGTTC AGGTATAGGT GGCGCAATTG CTTTAGGCTT AGGTTCACGA ATTACTGCGC 42 00 

CACATGGTGG TATTATTGTA ATTGTTGGTA CTGATGGTGC ACACTTACTT CAAACTCTTA 4260

TTGCACTTCT AGTTGGCACA TTAGTTTCAG CATTAATTTA CGGTTTAATC AAACCAAAGT 4320

TAACTGAAAC AGAAATCGAA GCTTCAAAAT CAATGGACGA GTAGTTTTAA TGATGTAAAA 4380

TGATTGTTAG CAAAGAGCTT CATATTAAGT TGTATGTTCA ATGAATATAT GTTAGTTTTA 4440

TATATCGTGT TAACGGTAGC TTATACAAAG CTGTAAAAAC ACTTTCTATT AATTCAGTTT 4500

TTATGAATTG ATATGAAAGT GTTTTTATTT TTAGATAAAT GAATGAAGAA ATAGACACCA 4560

CAAATGTATA GACTTTTTTA ATATTTTGCA AAAAGTTATG CCAAACGAAG CAGATATAGT 4620

AAAATATGAG TGTCTTAAAG TGAAAATTTA TAAATAAAGA AGGGTTTATA CGTGTCAGAA 4680

TTAATTATAT ATAACGGCAA AGTTTATACT GAAGATGGCA AAATCGATAA TGGTTACATT 4740

CATGTGAAAG ATGGACAGAT TGTTGCAATT GGAGAAGTGG ATGATAAAGC AGCAATTGAT 4800

AATGATACGA CAAATAAAAT TCAAGTGATT GATGCTAAAG GTCATCATGT ATTACCAGGT 4860 

TTTATTGATA TACATATTCA TGGTGGTTAT GGTCAAGATG CAATGGATGG GTCATACGAT 4920

GGCTTAAAAT ATCTATCCGA AAATTTGTTG TCTGAAGGGA CGACATCATA CTTGGCCACT 4980

ACAATGACGC AATCGACTGA TAAAATAGAT AATGCACTTA CAAATATTGC TAAATATGAA 5040

GCGGAgCAAG ATGTTCACAA TGCAGCGGAA ATTGTAGGTA TACATTTAGA AGGACCATTT 5100 

ATATCTGAAA ATAAAGTTGG TGCTCAACAT CCGCAATACG TTGTACGCCC ATTTATCGAT 5160 

AAAATTAAAC ATTTTCAAGA GACTGCTAAC GGATTAATAA AGATTATGAC GTTTGCACCT 5220 

GAAATTGAAG GTGCAAAAGA AGCGCTTGAA ACGTATAAAG ATGACATTAT TTTTTCAATT 5280

GGTCATACAG TAGCAACATA CGAAGAAGCA GTTGAAGCTG TTGAGCGAGG AGCTAAACAT 5340

GTCACGCATT TATATAATGC AGCGACGCCA TTCCAACATA GAGAACCAGG TGTTTTTGGA 5400

GCAGCATGGT TGAATGATGC TCTACATACC GAAATGATTG TTGATGGCAC TCATTCTCAT 5460

CCGGCATCGG TTGCAATTGC TTACCGTATG AAAGGTAATG AACGTTTTTA TTTAATTACC 5520

GATGCAATGC GTGCAAAAGG TATGCCTGAA GGAGAATATG ATTTGGGTGG ACAAAAAGTA 5580

ACTGTTCAAT CGCAACAAGC ACGTCTTGCA AATGGTGCGC TTGCTGGTAG TATTTTAAAA 5640

ATGAATCATG GGTTACGTAA CTTAATATCA TTTACAGGTG ATACATTAGA TCATTTATGG 5700

CGAGTAACAA GTTTAAATCA AGCCATTGCA TTAGGTATCG ATGATAGAAA AGGTAGTATT 5760

AAAGTAAATA AGGATGCAGA TCTTGTTATT CTAGATGATG ATATGAATGT AAAATCTACA 5820

ATAAAACAAG GCAAGGTTCA CACATTTAGC TAATAAATAA TCATAATTAA ATGTATGCAA 5880
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TTTTCTGGGG GTGTCTAAAT GGGAAGGCGA TAACATGTAG TTGTAATTTA AGTCATAGTG 6000 

ATAAATTTGA ATGCGTGTTA CCCATGAGTG ACACATATAA CATGGAGGTG AATCCCTAGA 6060

AATAGGGAAT TAATTGGAAA CTTCGACCAT AATTAGTTTG ATTATATTTA TTCTATTAAT 6120 

TGCATTAACC ACTGTATTTG TTGGTTCAGA ATTTGCATTA GTAAAAATTA GAGCAACAAG 6180 

AATTGAACAG CTAGCAGATG AAGGAAATAA ACCTGCTAAA ATAGTAAAAA AGATGATTGC 6240

TAATCTAGAT TATTATCTTT CTGCTTGTCA GTTAGGTATA ACAGTAACAT CTTTAGGGTT 6300

AGGTTGGCTT GGTGAACCAA CGTTTGAAAA GCTATTACAC CCAATATTTG AAGCAATCAA 6360

TTTACCAACT GCATTAACGA CGACGATTTC GTTTGCAGTG TCATTTATAA TCGTTACGTA 6420

TTTGCATGTA GTACTTGGTG AATTAGCGCC TAAATCTATA GCTATTCAAC ATACTGAAAA 6480

GCTTGCTTTA GTATATGCAA GACCATTGTT CTATTTCGGT AACATTATGA AACCATTGAT 6540

TTGGCTGATG AATGGTTCTG CACGTGTTAT TATTAGAATG TTTGGTGTAA ATCCTGATGC 6600

CCAAACTGAT GCAATGTCAG AAGAAGAAAT CAAAATTATT ATTAACAATA GTTATAATGG 6660 

TGGAGAAATC AACCAAACTG AATTGGCATA TATGCAAAAT ATCTTTTCAT TCGATGAAAG 6720 

ACATGCAAAA GATATAATGG TACCTAGAAC TCAAATGATT ACACTAAATG AACCTTTTAA 6780 

TGTAGACGAA TTACTAGAAA CAATAAAAGA ACATCAATTT ACGCGTTATC CAATTACTGA 6840

TGATGGTGAT AAAGACCACA TTAAAGGATT TATTAACGTC AAAGAATTTT TAACTGAATA 6900

CGCTTCTGGA AAAACGATTA AAATAGCAAA CTATATaCAT GAGTTGCCAA TGATTTCAGA 6960

GACAACACGT ATCAGTGATG CATTAATTAG AATGCAACGT GAACATGTAC ATATGAGTCT 7020 

TATTATAGAT GAATATGGTG GAACGGCAGG TATTTTAACG ATGGAAGATA TTTTAGAAGA 7080 

AATCGTTGGA GAAATTCGTG ATGAATTTGA TGATGATGAA GTGAATGATA TCGTTAAAAT 7140

TGATRATAAG ACATTCCAAG TAAATGGCAG AGTACTATTG GATGATTTAA CTGAAGAGTT 7200

CGGTATAGAA TTTGATGACT CTGAGGATAT TGATACGATA GGTGGATGGT TACAATCTCG 7260

TAATACCAAT TTACAAAAAG ATGATTACGT GGATACAACT TATGATCGCT GGGTTGTTTC 7320

AGAAATCGAT AACCACCAAA TTATTTGGGT GATATTAAAC TATGAATTTA ATGAAGCGAG 7380

ACCTACTATC GGACAGTCTG ATGAAGATGA AAAATCAGAA TAGATATTAA TATATAAACC 7440

AACTAAGAAT GATTTAATTC ATTTTTGGTT GGTTATTTTT TTGACTAAAA TTAAnGAAAA 7500 

GTGAAAATAG TATTGGAACT CAATATCTTT AATGATTTAA TGAATAAnTT TTATTGAAAG 7560 

CGA 7563 
(2) INFORMATION FOR SEQ ID NO: 34: 
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(A) LENGTH: 34 92 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

TTATATCAAC TTCATGGCGG AACCATTGAT GACCCATTAG ACGAAACAAT AAGCGCATTT 60

SATGAATTGA AACAAGAAGG AATTATACGT GCTTACGGTA TTTCTTCTAT TCGCCCAAAT 120 

GTAATTGATT ATTATTTAAA ACATAGTCAA ATCGAAACGA TAATGTCTCA ATTCAATTTG 180 

ATTGATAATC GTCCAGAATC ATTATTAGAT GCAATTCACA ACAATGATGT TAAAGTATTG 240

GCAAGAGGAC CTGTGTCTAA AGGATTATTA ACTTCAAACA GTGTTAATGT GCTCGACAAT 300 

AAATTTAAAG ATGGTATTTT TGATTATTCT CATGATGAAT TGGGTGAAAC AATAGCCTCT 360 

ATTAAAGAAA TTGAAAGTAA TTTATCTGCA TTGACATTTA GTTATTTAAC ATCACATGAC 420

GTGCTTGGTT CCATCATTGT AGGTGCAAGT AGCGTCGACC AATTAAAAGA AAATATTGAA 480

AACTATCATA CTAAAGTTAG TTTAGATCAG ATTAAAACAG CAAGAGCTCG TGTAAAGGAT 540

TTGGAATATA CCAATCATTT AGTGTAGAAG TCATTTTCAG TAATAAAAAC AGCAGCATGA 600

GGCGTTTCAT TATAAAAATG CCTTACTGCT GTTGTTTATG TACAATTCGC TATAATTTAT 660

GATTATGATT ACTCACTTAT GATAGAAATT AAAGCGTTGT CCTCACGCAT CAGTATTTAG 720 

TAATTTCGCC TTGCGGCATT GCCTTAAGCA AACTTCTGCC ACTTCATCTC TTAATAATTT 780 

TATTAAAACA TCTTTCTATA TTTCACTTCG CATGTTGATT CATCATTATT AGTTATTATT 840

TGTACACCCA GCACATTTCC TTGCAACACA AGTAGTTTGA ATTTTTCACA AGTATAATAT 900

AATGTACCGT CTGAAATTTG GTCTACAGAA ATATCGCCTA AAATATCCAG CACTGTAAAT 960

TCTTCAAATA CTGATAGTTG TTCCGCATAT CGTACACAAA GTCTTACCAC ACTCTCCGAT 1020

TGACAGTTCA TTGCCATCCC ACCTATTTAT GCTTTATTTT TAAATAATTT AGGGAAACAT 1080 

CGTTCAAAAA ATCTAGGCGC AATTTGATAC ATTTTCAACG CATGaTGCAT CCATTTAGGC 1140 

CGATTAATTT CCAATTGTTT TGTTTTAATG CCATAAATGA TATCTTCTGC AAGCTGATTA 1200 

GCATCAAGCA TAATTTCCCC CATCTTTTTA gCATACTTCA TTGATGGGTC GGCTTTTTGA 1260 

TGAAAAGGTG TATCAATCGG GCCAACATTA ACTGTCATGA TATGTAAGTT TGGTGACTCT 1320 

AGTCTTAAAG CATTCATTAA TGCATAAAAC CCTGCTTTCG ATGCCCCATA ATGTGCAGCA 1380 

TTTGCTTGTG TGGAAAATGC AGCTTGACTT GAAATACCTA CAATATGTGC GTTAGATGTT 1440

AAATATGGTC TCAACACAGT ATATAAAACA TTAAAACTAA TTAAATTAAG CTGATACGTT 1500 
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TAAATGAATC CATOGAATGA TGTATTGTCT TCAAATTGCA GTGCCTGTAT CGACTTCAAA 1620 

TCATTTAAGT CACAAGGAAT AACATTTATA GTTTTCCCCA ATTCCTGTTC AAAGATTCTA 1680 

GTTGCTTTAT CAACATCACG CACCAACAAC GTTACATGCA CTTTATTTTC TAGTAACTTT 1740 

CGGACAATCG ATAAACCTAA ACCACTCGTA CCACCAGTCA CTATAAAATG TTGTCCTTTC 1800 

ATCAATTAAC CTTCCTTTTC AATTATATAG AATGCAATTT ATCAACTTTA CATAATTGAG 1860 

ACAAGTTGAT TATCTTTCCT AATATATATA CAATAATAAG AAAATATAAC ATACAAATCA 1920 

AAAACTAAAG GGATGTGaCG TTAATGrAAC TCGTATTTTA TGGAGCTGGT AATATGGCAC 1980

AAGCTATATT TACAGGrATT ATTAACTCmA GCAACTTAGA TGCCAATGAT ATATATTTAA 2040

CAAATAAATC TAATGAACAA GCTTTAAAAG CATTCGCTGA AAAACTAGGT GfTTAACTATA 2100 

GTTATGAtGA TGCGACATTA TTAAAAGATG CAGAyTATGT ATTTTTAGGT ACCAAACCAC 2160 

ATGACTTTGA TGCTCTAGCA ACACGCATCA AACCACATAT TACAAAAGwC AATTGCTTCA 2220

TTTCAATTAT GGCAGGTATT CCGATTGATT ATATTAAACA ACAATTAGAA TGCCAAAATC 2280 

CaGTTGCTAG AATTATGCCA AACACAAATG CGCAAGTTGG ACACTCTGTT ACTGGCATTA 2340 

GTTTTTCAAA CAACTTTGAC CCTAAATCTA AAGATGAAAT TAACGATTTA GTTAAAGCAT 2400 

TTGGTTCTGT AATTGAAGTA TCAGAAGATC ATTTACATCA AGTAACAGCT ATCACCGGAA 2460

GCGGCCCAGC ATTTTTATAT CATGTATTCO AGCAATATGT TAAAGCTGGT aCsAAACTTG 2520

GTCTAGAAAA AGAACAAGTT GAAGAATCTA TACGCAACCT TATTATAGGT ACAAGTAAGA 2580

TGATTGAACG TTCAGAtTTG AGCATGGCTC AATTAAGAAA AAATATTACC TCTAAAGGTG 2640

GTACGACACA AGCTGGCCTT GATACATTGT CACAATATGA TTTAGTATCT ATTTTCGAAG 2700

ATTGTCTAAA CGCTGCCGTC GACCGTAGTA TTGAACTTTC TAATATAGAA GACCAATAAA 2760 

AACA5ACCCG CCAACACATG TATGCATCAT CGCAAGCACT GTGTTTGACG GGTTATTTTT 2820 

ATAATTTATT GTTATTTGGC AAGCATTGTT TATTACTTTG TCATTAGATT TTAAAACTAT 2880

CAAAATCTTT TACAAAATTA AAATTAGGTG TATCTTCATT TTGTATCAAT GTTTGATAAA 2940 

TTTCATTTAT ATCTTCTGTA TTATAGCGAT TGCTCAAATG TGTAATCAAC GTACGTTTAA 3000 

CATTGGCTTC TTTTATCAAT GCAAATACGT CTTCAATATG GCTATGATGA TAATTGTTGG 3060 

CTAAATGCTT TTCACCATCT ATATAGGTCG CTTCATGTAC CATCACATCA GCATCTCTAG 3120 

AAATCACACG TTCATTAGAA CATGGTTTTG TATCACCAAA AATTGCTACA ACTGGACCCT 3180 

GTTTGGACTC ACCTCTAAAA TCTTTTGATT GATAAACTTG ACCATTATGT TCAAATGTAT 3240 

CATGAGATTT TACTTCTTGA TATTTAGGAC CTGGTTCAAG ACCAATGTTT TTTAACGCTT 3300
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CATGATTAAG TAAATGCGCC TCTACAGTAA AACCATCCAT GATGATATGT CAGATGATCA 3420 

TCGATTTCAA TATATGtAAT TGGATAGTTT AAATGTGACT CTGATAAATT CATAGACATT 3480

TCCACATATG CT 3492

(2) INFORMATION FOR SEQ ID NO: 35:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1973 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

ATCTAGCGGT ACAAGCGTCT TGGAGGCTAG TATGTTGAAC ATTGTAAACC CTGAAGATCA 60 

CTTCGTTGTC ATTGTTTCAG GTGCCTTTGG TAACCGATTT AAACAAATTG CACAAACTTA 120

TTACAAAAAT GTGCATATTT ATGACGTAAC ATGGGGAGAA GCTGTAGATG TCAAAGATTT 180 

CATCAATTTC CTTTCAACTT TAAATGTTGA AGTTAAAGCA GTATTTAGTC AATATTGCGA 240 

AACATCTACG ACAGTGCTAC ACCCTATTCA CGAGTTAGGA AATGCCATTA ATCAATTTAA 300

TAGTAATATT TATTTTGTAG TTGACGGCGT AAGTtGCATT GGTGCTGTTG ATGTTGACAT 360 

TAACAAAGAT AAAATTGATG TACTTGTTTC TGGTAGTCAA AAAGCAATTA TGTTACCTCC 420 

AGGATTAGCT TTTGTAGCTT ATAGCCACCG TGCAAAAGAA CATTTCAAAG AAGTAACTAC 480

GCCAAAATTT TATCTAGACT TAAATAAATA CATTTCGTCA CAAGCTGACA ATTCTACACC 540

GTTCACACCA AATGTGTCTT TATTTAGAGG TGTAAATGCA TACGTTGAAA CCGTAAAAGC 600

AGAAGGTTTC AATCACGTAA TAGCACGACA CTATGCAATT AGAAATGCAT TAAGAAGCGC 660

CTTAAAAGCA TTAGATTTAA CTTTATTAGT CAATGATAAA GATGCATCTC CAACGGTTAC 720 

AGCATTCAAA CCTAATACAA ATGATGAAGT GAAAATAATC mAAGATGAAC TTAAAAATnG 780 

CTTTAAAATA ACAATTGCnG GTGGTCAAGG CCATCTTAAA GGTCAAATTT TnAGAATTGG 840 

TCATATGGGG AAAATTAGTC CTTTCGATAT TTTATCGGTA GTATCTGCTT TAGAAATTAT 900 

TTTAACTGAA CACCGTAAAG TTAACTATAT CGGTAAAGGT ATATCAAAAT ATATGGAGGT 960

TATTCATGAA GCAATTTAAT GTACTCGTTG CAGATCCCAT ATCAAAAGAT GGTATCAAAG 1020 

CATTATTAGA TCACGAACAA TTCAATGTAG ATATTCAAAC TGGCTTGTCC GAAGAAGCAT 1080 

TAATCAAAAT TATACCTTCA TACCATGCTT TAATCGTTCG TAGTCAAACT ACGGTTACTG 1140
AAAATATCAT AAATGCTGCT GATTCTTTAA AAGTAATCGC ACGCGCCGGT GTTGGTGTAG 1200
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GTAATACGAT 


TTCAGCTACT GAACATACAC TGGCAATGTT ATTATCAATG GCACGAAATA 1320

TTCCGGAAGC ACACCAATCA CTTACAAATA AAGAATGGAA TCGAAATGCA TTTAAAGGTA 1380

CTGAGCTTTA TCATAAAACA TTAGGTGTCA TTGGTGCTGG TAGAATTGGT TTAGGTGTTG 1440

CTAAACGTGC GCAAAGTTTC GGAATGAAAA TACTAGCTTT TGACCCTTAC TTAACGGATG 1500

AAAAAGCAAA ATCTTTAAGC ATTACGAAGG CAACAGTTGA TGAGATTGCC CAACATTCTG 1560

ATTTCGTTAC ATTACATACA CCACTAACAC CTAAAACAAA AGGCTTAATT AATGCTGTCT 1620

TTTTTGCCAA AGCAAAACCT AGTTTGCAAA TAATCAATGT GGCACGTGGT GGTATTATTG 1680

ATGAAAAGGC GCTAATAAAA GCATTAGACG AAGGACAAAT TAGTCGGGCA GCTATCGATG 1740

TCTTTGAACA TGAACCTGCA ACTGACTCGC CTCTTGTTGC ACATGATAAA ATTATTGTTA 1800

CACCTCATTT GGGTGCTTCA ACAGTCGAAG CTCAAGAAAA AGTGGCAATT TCTGTTTCAA 1860

ATGAAATCAT CGAAATTTTA ATTGATGGTA CTGTAACGCA TGCAgTGAAT GCACCTAAAA 1920

TGGACTTAAG CAATATAGAT GATACTGTAA AATCATTCAT CAATTTAAGC CAA 1973


(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7620 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:






GGTGTTTCAG ATGTCACTGG TTGATrriTA ATTGTAGACG GGTATTTTGG GCTTTCGCCA 60

TATTTATTTG CCGGCTTACT GTCAAAGCAT AGGAATACTA TCATAACAAT TGTTAGGCCT 120

AAATfiAACAA AATAAAGAAG TACTAACAAA ATATTAAGAC CCATCGGCAT TAATGTAAAA 180

TCACTGTCAT AATAACTATC GATAATCTGT AATACTATAT AAAATATAAT ACTGAATACT 240

GTCATAATCA TTGGAAATAA CATTGTTCTT GATATATCGT GAAATCTTCG AACGCACAAC 300

GCTAAATTTG GAATAAACGT TGCCAAACTA TAGACAAAAG TATACACAGA TGTAAGGATA 360

ATCATCAATA TACTCATAAC TATTAATGTT TCGTTATCCG CCGCTATAGA AATAAAGAAT 420

AGAAATAGGT TTATTATTAG CACACACACA GCTGGAACCA TAAGTATCAA ATGCCATAGT 480

GCCATATACC AATATTCACT ACGTCTTGAT CTCCCCTTAA AATTTACATA ATTTTTCCAA 540

AATAAAACGA ATGATTTCAT AAAACCTACT TGAGGTAATT GTTCCATTGT AATCTCCCTT 600

TCGTTAATCA TATTTATATT TTTAATTATT GTTACCGTTA TAATTTACAA GATTCATTAT 660
GTAAAATGAA


AACCCGCTAC 


AAGTACACAT 


CTATATGGAG 


ACTCATTTGA AAGTCAACGC 


780 




TTCGTTAACT 


ATACTAAAAA 


TATGTCATAC 


TGCAATGTTC 


ACGTTTAAAA 


GAGTCTCAAT 


840 




CTATGCAAAT 


AAAATATTCC 


ATAACAAAGT 


ATATACTTTA 


CAT1TTTATA 


ATTCTTAACA 


900 




ATACTATTTT 


ATCAAACATT 


TACCACAATA 


AAAATATCTT 


TTTCATTTTT 


ATTTAAATTA 


960 




ATCATATAAT 


TGCGAGGAGA 


ATATTATGGA 


TTTCGTTAAT


AATGATACAA 


GACAAATTGC 


1020 


TAAAAACTTA 


TTAGGTGTCA 


AAGTGATTTA 


TGAGGATACC 


ACTCAAACGT 


ATACAGGCTA 


1080 




CATCGTGGAA ACGGAAGCTT 


ACTTAGGTTT 


GAATGATCGT 


GCGGCTCATG 


GCTATGGCGG 


1140 




TAAAATAACA 


CCTAAAGTCA 


CGTCATTATA 


TAAACGTGGT 


GGTACAATTT 


ATGCACATGT 


1200 




CATGCATACG 


CATTTACTCA 


TTAATTTTGT 


AACAAAATCT 


GAAGGTATAC 


CTGAAGGCGT 


1260 




ACTTATCCGC 


GCAATTGAAC 


CAGAAGAAGG 


TTTATCCGCT 


ATGTTCCGTA 


ACAGAGGTAA 


1320 




GAAAGGCTAC 


GAGGTAACGA 


ATGGCCCAGG 


AAAATGGACT 


AAGGCATTTA 


ACATTCCACG 


1380 




GGCTATCGAT 


GGCGCTACGT 


TAAATGACTG 


TAGATTGTCT 


ATTGATACTA 


AGAATCGTAA 


1440 




ATATCCTAAA 


GATATTATTG 


CTAGTCCACG 


AATCGGTATT 


CCAAATAAAG 


GTGATTGGAC 


1500 




ACATAAATCT 


TTACGTTACA 


CAGTGAAAGG 


TAATCCATTT


GTGTCTCGCA 


TGCGTAAATC 


1560 




AGATTGTATG 


TTTCCCGAAG 


ATACTTGGAA 


ATAAATGCCA 


TCTTTCATTG 


ATTACTATCA 


1620 




TGAAAATGAA ATCTATCTCC 


TTATAAGTCA 


ATCAATCGTG 


CCGTCAACAT 


GCGGATGGGT 


1680 




TGATTGTTTT 


TCTTTGTATC 


CATCATATTT 


TTTGATTCAT 


CTCCTCTTAT 


TGAACTTGTT 


1740 




CTTAATTATA 


AAATATAACA 


ATAGAATTAT 


TTATAATTAT 


TAAATTTAGA 


TGCATTAATA 


1800 




TTATTGATAT 


TATTTTCAAA 


AACTAGAAAT 


ATTGATTTGT 


TGCATGTATA 


ATGTTAAAAG 


1860


CGCCCTTTTA 


TAACGCTTAC 


ATATAAAAGC 


TTATTTAGGG 


AGAGGGATAT 


TCAACAAGGG 


1920 




GGATTTGAAA 


ATGATAGAAC 


TTAATGCAAT 


TACAACATTA 


TGTTTAGCTT GTATCCTTTA 


1980 




TTTACTTGGT 


AAGGCTATCG 


TTAATCACGT 


TAATTTTTTA 


AAACGTATTT 


GTATACCAGC 


2040 




ACCAGTGATT GGCGGCTTAA 


TCTTTGCTAT 


TTTAGTTGCG 


GCTTTGGATT 


CATTTGGCAT 


2100 




GGTTAAGATT 


AAATTAGATG 


CTTCATTCAT 


TCAAGATTTC 


TTCATGTTAG 


CATTCTTTAC 


2160 




GACAATCGGT 


CTTGGTGCAT 


CATTGAAATT 


ATTTAAATTA 


GGTGGCAAAG 


TCTTGCTATT 


2220 




ATACTTTATG 


TTTTGTGCTA 


TCATTTCAGT 


CATTCAAAAC 


ATAGTTGGTG 


TATCACTAGC 


2280 




AAAAGTATTA 


AATATTAAAC 


CTTTGTTAGG 


ATTAACAGCA 


GGTTCCATGT 


CTATGGAAGG 


2340 




CGGTCATGGT 


AATGCTGCTG 


CTTATGGTAA 


GACAATTCAA 


GATTTAGGTA 


TTGATTCGGC 


2400 




ACTGACAGCG 


GCTCTTGCAG 


CTGCAACTTT 


AGGTCTTGTA 


TTTGGAGGGC


TTATCGGTGG 


2460 
ATTTAAAGAT TATAGCCAAG TAGCATATAA CGAACATTTA CATAGTAAAT TTAATGCCAC 2580

TGAAGTATTC TTCATTCAAT TTACAATCGT TGTATTCTGT ATGGCAGTTG GAAGTTATTT 2640

CAGTCATTTG TTTACAGCTC AAACAGGGAT TAATGTTCCA ATTTACGTTG GCTCATTATT 2700

TGTAGCTGTT ATTGTCCGAA ATATCTCTGA AAGTTTTAAT TTTAATATTG TAGATTTAAA 2760

AATTACTAAT CAAATTGGCG ATGTCGCATT AGGTATTTTC TTATCTCTTG CGCTAATGAG 2820

CATTCAATTA ATCGAAATTT ATAAACTTGC TATACCTCTT ATTATTATCG TTTTAGTTCA 2880

AGTTGTCGTT ATGATTTTAT TTGCTGTTTT AATTTTATTT AGAGGTTTAG GAAAAGATTA 2940

TGATGCTGCA GTAATGGTAG GTGGTTTTAT CGGTCATGGG CTTGGTGCAc GCCAAATGCC 3000

ATGGCAAATT TAGATGTTAT TACTAAAAAA TATGGAAACT CACCTAAAGC ATATTTAGTT 3060 

GTACCTATTG TTGGTGCATT CTTAATCGAT TTAATTGGTG TTATAGTCAT TATGGGATTC 3120 

ATACAATGGT TTAGTTAAAC ACCAAACTCA TAAATAAAAG AGGAGGCCTT CGCCTCcTcT 3180

TTTATTTATC CTCGATGTAT ATTCAAGTTA CGTTGTTCTA TCCATGACAA TATTTCCGGA 3240 

CTAAATACGA TTTGTTTTTG TGTTAAGTCG TCAATATTTT TAGCATCTAA CATCGTCATT 3300

ATTGATTTCA TGTGTTCAAT AAATGATTCT ACATAAGCTA CTGTATGTftC AATGCCATTA 3360

TTTTCAACTT GATTTAAAAA CGGACGTGAC ATACCAGTTG CCTTTGCACC AAGTGCTAAA 3420 

CTTTTAATTG CATCGAGTGG TGTACGTAAA CCACCACTCG CGAAAACTGA AATTTCGCTT 3480

TGATAAGCCG TTGTTTCAAG TAATGACTCA ACTGTAGACT GTCCCCATGA TGATAAGTAA 3540

TCCATATCTT TATTTGCACG ACGTTCATTT TCAATATCTA CAAAGTTAGT ACCACCTTTG 3600 

CCACTAACAT CGACATACTT GACGCCTATT TGTTGTAAGT CATGCATTAA TTCTTTGCTC 3660 

ATACCAAATC CAACTTCTTT TATAATGACT GGAAGAGACA CTCGTGATAC AATCGACGCT 3720 

ATA*Ct*ATCTA ACCAAGTCAC AAATTCACGA TTCCCTTCAG GCATAACTAA TTCTTGAGGA 3780 

GAATTAACAT GGATTTGTAA CGCTTGTGCC TCAAGTAATT CAACTGCTTC CAAAGCCTTT 3840

TCTACTGGTA CGTCCGCACC AACATTGCTA AAAATCATGC CTTCAGGATT CATTTTTOGC 3900 

GCAATCGTAA ACGTCTCAGC CATGCGTGGA TTTCTCAATG CCGCATGTGT TGATCCAACT 3960

GCCATCGCTA AGCCAGTTTC TCTTGCAACT ACAGCTAGCT TTTCATTGAT GTTTTTCGTC 4020

CACTCGCTAC CACCCGTCAT TGCATTAATA TAAACCGGAT ATGCCATCGT TAAGTCAGGC 4080

GTCTGTGATG TCAAATCGAT ATCATTTACA TTAATTGATG GGATAGAATG ATGCACAAAA 4140

CGCATCTTAT CAAAATCTGA ATGCATTGCG TCAGATTGGG CCATTGCTAT TTCAACATGT 4200 

TCATTTTTTC TCTGTTCTCT TTGAAAATCA CTCATGATTA AACCTACCTT TTCGTCATTT 4260
ATTACAGCTA AGCAAATATA ATATCCATAA TGTAAATGTA ATGCCGGCAT ATTTACAAAG 4380

TTCATACCAT AAATCCCAGC TATGAATGTT AACGGTGAAA ATATAACTGA TACTAATGTC 4440 

AGTACTTGCA TAATACTATT CATTCTAAAT GACGTGTATG ACTCAAAATT TTCTCGTATT 4500 

TCGTTTGTCA TTTCTTGAGC AGTACGAATG ATATTACGTT GCTTAATCAA GTGGTCATCG 4560

ATATGTTGAA TGTATAGCGA ATGTTTATTA TCTATAATCA AATCACCATT TTGTTTCATT 4620

GTATCAATTA GCTCTTGCAT AGGAAACAGT ACACGTTTTA CTTTAATCAA ATCCGAACGT 4680

AACTTAAAGA CACTATCCAT GACCATTTTA TTAAAGCGAT CATCTACATG GCGGTCTTCA 4740

AAATGATAAA CACTATCTTC AAGTGCATAT ACAAAGTTGA AATATTTATC AACCATCATA 4800

TCTAAAATTA ATATGACGAC ATCTGCACAA TCTAATTCTG CATCTAATGT ATTCATATAC 4860

TTATAGACTA CTTTATTTAA TGATTCCAAC GTTTGATGAT GATATGTTAC TAATACATTG 4920

TCTTGTATAA AAATATTTAG TGCTATTGGT GAATAGTTTG ACCCCATAAT ACTATGGAAT 4980

ACTAAGTATT GATAATCTTT ATAAGATTTA TATTTAGCTC GTGGCATACC GTTAATTGCA 5040

TCATCCACTT CTAAATCATT AAAATTAAAA TGTGCTTTAA ACCATTCATT TTCTTGTTCA 5100 

TTCGGTTCAT CAAAATCATA CCAAACAATA GTCGCATCTT TTGGTATCTC TTTGATATCA 5160

TCAACTACTT TAAACGGTTC ATATGTAGTT TGATACCGTA TCTTTAAAGC CATCGATACT 5220 

CCCCCTAAAT AACGAATTCT CTATTATTTT ATCATGAATT AAATAACGTG TATGTCTTAA 5280

TTTATTTTAG TATGATAGTC ACTAAGGAGA TGGTTATTAT CAAACAACTT TTTACACATA 5340

CTCAAACCGT AACATCTGAA TTCATTGACC ATAACAATCA TATGCATGAT GCAAATTATA 5400 

ATATCATTTT TAGTGACGTC GTGAATCGTT TTAATTACAG CCACGGTCTT TCTTTAAAAG 5460 

AACGCGAAAA TTTAGCATAT ACGCTATTTA CACTAGAAGA ACATACGACA TACCTCTCAG 5520 

AATTGTCTCT TGGCGATGTA TTTACTGTTA CTTTATATAT TTATGATTAC GATTATAAGC 5580

GGTTGCATTT ATTTTTAACA TTAACTAAAG AAGATGGTAC ACTAGCATCA ACAAATGAAG 5640

TAATGATGAT GGGAATTAAT CAGCACACAC GTCGTTCTGA TGCTTTTCCT GAATCATTTT 5700 

CAACACAAAT AGCACACTAT TATAAAAATC AATCAACTAT CACTTGGCCT GAACAATTAG 5760 

GACATAAAAT AGCAATTCCA CACAAAGGAG CATTAAAATG ACAGATGCAT TACAACAAAA 5820

GATTCATATC GAATTACTAG ATTTATTAGA TGATGTTAAG TTTGAATTAA CAGAATTAAA 5880

TGCACAAAAA GGGTTATACA TTAACGGACC AGCAAATCAG CTACTTAAGC GTGGCGTGCA 5940

TATGGCTTAT GTTCAAGGAC AAAAGCAAGC CATCGATAAT ATTATGACTA TTGTGGAACA 6000 

ACAGCTTGAA AGATCAACAT TTCCTAGAAC ATTATGATAA ATTTCAAAAT GAGGTTGCTC 6060 
ATAATTTTTT AGATCAATTT TATCAAATTA AAGGGCAATA CTTTATCATC ACACATATCA 6180

ATACACTTAT TGGTGATTTT CACTCAGAAG CTCATTAACA ATTAGTCTAT ATAACCCTTG 6240

CTATATTTTC AAAAACAAAA CCCAATTACG TTTTCATGTC AAATATCATC TTGCATGAAA 6300 

TCGTAACTGG GTCATTTATA TGTTATTAGT TATTTTGTGT TACATCCTCA TCTATCGATT 6360

TGGCAATTTG TTTAATAGCT TTATGTGATT GTCTAATTGG ATAAATTGGA AAATCATGTA 6420 

CCATCTTAGG ATAATCATAA AACTCAATGT ATTGATGATG TTGCAACATC ATTTGTTCAA 6480

ATAGCTTCAT ATCAGGATGT GTCATTTCAC GTCCACCACC AAACATATAA ACTGGTGGCA 6540

ATCCTTCTAT TGTGCCATTA ATTGGCGATA TGCGCTTATC TGTTAATGGT AGGCCATTCG 6600

CCCATTTTTT CATAATCTCA TTGACACCAA ACTGACTTAG aACCGCATCT TGTTCGATTA 6660

AGGCGTCCGA AATATCTTTA TTAGATAGTG TTGCATCTAA AATTGGTGAG ATTAAATACA 6720

ATTTATTCGG TAATGGCTGT TGATTAkCTA AAAGAGATTG TACAAAGGAT AATGCCAGTG 6780

CACCACCTGA ACCATCACCC ATGACTACGA CATTTTGATG TCCTACTTCA GATACTAATT 6840

GaTCATAAAC ACGTTGTATC GCTTGGnAAA GTATCGTCaA TATGnAAACT CTGGTGTCTT 6900

TGGATAGATA GGCAGTACAA CCTCATATAA TGtACTTAAA GTGATTTTAT CCCAACAATC 6960

TCCAATGGAA CGGTGATGGT TGTAGTGCAT TGAATCCACC GTGAATATAT AAAATTTTCT 7020

TATCAATTTG ATGTCTGAAA TTAAAGCGAA AGACTTGCAT ATCATCTAAT GACAATTTTT 7080

CTAAATTTGC TTTAACATTT AATGTTGAAG GCTGCTTATG TTTTTTTCTA TTTTCAATTT 7140

CTCTTTTATA AAAAAATCTT TCAACATCTT GATCATTTTT AAACATAATC GAGCGATTGT 7200

GAAGCAAATA TTTATTGACA ACGCTATTCA TAACACGGTT TCTAATCAAT GTCTTAACCT 7260 

ACCTTTATAT ATTTTATGTA TCCAATGATk GTCTATCCCC TACATTCTTT GCCAAAAAAA 7320 

GTATATAATG TAGAAGATAT TTTCTTTTTC ACTTTCAAAT TTAAGACTAC AATTGAACAG 7380

TGATTTTTCA TCATTATAAC AGACAACTAG ACATATTGAT AAGTAAAGAA AAGAACTTTA 7440

TACGGAGGTA CCTTGCATGA CAAATCCAAA TCAACGATTA GAACCATTTG ATGAGACATT 7500 

TCAACAACCG AATATTCATC GTGGTAAGCG ATATGGTAAG AAAAAACGTT CATTGGTAAG 7560 

CATGATTATT CAAATCATTG TTGTwATATT AACCACCATC GCTGGAATAC AGCATGGTGG 7620

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 9834 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:





GTCATtACCG 


amTTTCtTAG 


AaTCATTTAA 


AGATGATAAA TATACAAACG 


TTGGTAATTT 


60 




AAAAGAAGTG 


AATTTTGATA 


AAATTGCTGC 


GACGAAACCC GAAGTAATCT 


TTATCTCTGG 


120 




ACGTACAGCT 


AATCAAAAGA 


ATTTAGATGA 


ATTCAAAAAA GCTGCACCTA 


AAGCGAAAAT 


180 




TGTTTATGTT 


GGTGCAGATG 


AAAAGAACTT 


AATTGGTTCA ATGAAACAAA 


ACACTGAAAA 


240 




TATCGGAAAA 


ATTTACGATA 


AAGAAGATAA 


AGCTAAAGAA TTAAATAAAG 


ATTTAGATAA 


300 




CAAAATTGCT 


TCAATGAAAG 


ATAAAACGAA 


AAACTTCAAT AAAACTGTTA 


TGTATTTACT 


360 




AGTTAACGAA 


GGTGAATTAT 


CAACATTTGG 


ACCTAAAGGT CGTTTTGGTG 


GATTAGTTTA 


420 


CGATACATTA 


GGATTCAATG 


CAGTTGATAA 


AAAAGTAAGT AATAGCAATC 


ATGGACAAAA 


480 




TGTTTCTAAC 


GAATATGTTA 


ATAAAGAAAA 


TCCAGATGTT ATTTTAGCGA 


TGGATAGAGG 


540 




TCAAGCGATA


AGTGGTAAAT 


CAACTGCGAA 


ACAAGCATTA AATAATCCTG 


TATTAAAAAA 


600 




TGTTAAAGCA 


ATTAAAGAAG 


ACAAAGTATA 


TAATTTAGAT CCTAAATTAT


GGTACTTTGC 


660 




AGCTGGATCA 


ACTACAACTA 


CAATTAAACA 


AATTGAGGAA CTTGATAAAG 


TTGTAAAATA 


720 




ATTTTAAAAG 


AGGGGAACAA 


TGGTTAAAGG 


TCTTAATCAT TGCTCCCCTC 


TTTTCTTTAA 


780 




AAAAGGAAAT 


CTGGGACGTC 


AATCAATGTC 


CTAGACTCTA AAATGTTCTG 


TTGTCAGTCG 


840 




TTGGTTGAAT 


GAACATGTAC 


TTGTAACAAG 


TTCATTTCAA TACTAGTGGG 


CTCCAAACAT 


900 




AGAGAAATTT 


GATTTTCAAT 


TTCTACTGAC 


AATGCAAGTT GGCGGGGCCC 


AAACATAGAG 


960 




AATTTCAAAA 


AGGAATTCTA 


CAGAAGTGGT 


GCTTTATCAT GTCTGACCCA 


CTCCCTATAA 


1020 




TGTTTTGACT 


ATGTTGTTTA 


AATTTCAAAA 


TAAATATGAT AGTGATATTT 


ACAGCGATTG 


1080 


TTAAACCGAG 


ATTGGCAATT 


TGGACAACGC 


TCTACCATCA TATATTCATT 


GATTGTTAAT 


1140 




TCGTQTTTGC 


ATACACCGCA 


TAAGATTGCT 


TTTTCGTTAA ATGAAGGCTC 


AGACCAACGC 


1200 




TTAATGGCGT 


GcrrrrcAAA 


CTCATTATGG 


CACTTATAGC ATGGATAGTA


TTTATTACAA 


1260 




CATTTAAATT 


TAATAGCAAT 


AATATCTTCT 


TCGGTAAAAT AATGGCGACA 


scgTGTTTCA 


1320 




GTATCGATTA 


ATGAACCATA 


AACTTTAGGC 


ATAGACAAAG CTCCTTAACT 


TACGATTCCT 


1380 




TTGGATGTTC 


ACCAATAATG 


CGAACTTCAC 


GATTTAATTC AATGCCAAAT 


TTTTCTTTGA 


1440 




CGGTCTTTTG 


TACATAATGA 


ATAAGGTTTT 


CATAATCTGT AGCAGTTCCA


TTGTCTACAT 


1500 




TTACCATAAA


ACCAGCGTGT 


TTGGTTGAAA 


CTTCAACGCC GCCAATACGG 


TGACCTTGCA 


1560 




AATTAGAATC 


TTGTATCAAT 


TTACCTGCAA 


AATGACCAGG CGGTCTTTGG 


AATACACTAC 


1620 




CACATGAAGG 


ATACTCTAAA 


GGTTGTTTAG 


ATTCTCTACG TTCTGTTAAA 


TCATCCATTT


1680 
AGTGTTCTTT TTGAATAATG CTATTACGAT AATCTAACTC TAATTCTTTT GTTGTAAGTT 1800

TAATTAACGA GCCTTGTTCG TTTACGCAAA GCGCATAGTC TATACAATCT TTAACTTCGC 1860 

CACCATAAGC GCCAGCATTC ATATACACTG CACCACCAAT TGAACCTGGA ATACCACATG 1920 

CAAATTCAAG GCCAGTAAGT GCGTAATCAC GAGCAACACG TGAGACATCA ATAATTGCAG 1980 

CGCCGCTACC GGCTATTATC GCATCATCAG ATACTTCGAT ATGATCTAGT GATAATAAAC 2040

TAATTACAAT ACCGCGAATA CCACCTTCAC GGATAATAAT ATTTGAGCCA TTTCCTAAAT 2100 

ATGTAACAGG AATCTCATTT TGaTAGGCAT ATTTAACAAC TGCTTGTACT TCTTCATTTT 2160 

TAGTAGGGGT AATGTAAAAG TCGGCATTAC CACCTGTTTT AGTATAAGTG TATCGTTTTA 2220 

AAGGTTCATC AACTTTAATT TTTTCATTTG GGATAAGTTG TTGTAAAGCT TGATAGATGT 2280 

CTTTATTTAT CACTTCTCAG TACATCCTTT CTCATGTCTT TAATATCATA TAGTATTATA 2340

CCAATTTTAA AATTCATTTG CGAAAATTGA AAAGAAAGTA TTAGAATTAG TATAATTATA 2400

AAATACGGCA TTATTGTCGT TATAAGTATT TTTTACATAG TTTTTGAAAG TATTGTTGCT 2460

TTTGCATCTC ATATTGTCTA ATTGTTAAGC TATGTTGCAA TATTTGGTGT TTTTTTGTAT 2520 

TGAATTGCAA AGCAATATCA TCATTAGTTG ATAAGAGGTA ATCAAGTQCA AGATAAGATT 2580

CAAATGTTTG GGTATTCATT TGAATGATAT GTAGACGCAC CTGTTGTTTT AGTTCATGAA 2640

AATTGTTAAA CTTCGCCATC ATAACTTTCT TAGTATATTT ATGATGCAAA CGATAAAACC 2700 

CTACATAATT TAAGCGTTTT TCATCTAAGG ATGTAATATC ATGCAAATTT TCTACACCTA 2760 

CTAAAATATC TAAAATTGGC TCTGTTGAAT ATTTAAAATG aTGctACCGC CAATATGTTT 2820 

TGTATATTTT ACTGGGCTGT CTAAGAGGTT GAATAATAAT GATTCAATTT CAGTGTATTG 2880

TGATTGAAAA CAATTAGTTA AATCACTATT AATGAATGGT TGAACATTTG AATACATGAT 2940

AAACTcCTTT GATATTGAAA ATTAATTTAA TCACGATAAA GTCTGGAATA CTATAACATA 3000 

ATTCATTTTC ATAATAAACA TGTTTTTGTA TAATGAATCT GTTAAGGAGT GCAATCATGA 3060

AAAAAATTGT TATTATCGCT GTTTTAGCGA TTTTATTTGT AGTAATAAGT GCTTGTGGTA 3120 

ATAAAGAAAA AGAGGCACAA CATCAATTTA CTAAGCAATT TAAAGATGTT GAGCAAAAAC 3180 

AAAAAGAATT ACAACATGTC ATGGATAATA TACATTTGAA AGAAATTGAT CATCTAAGTA 3240

AAACTGATAC AACTGATAAA AATAGTAAAG AATTTAAGGC ACTACAAGAA GATGTTAAAA 3300

ACCATCTCAT ACCTAAATTT GAAGCATATT ATAAGTCAGC AAAAAATTTG CCTGATGATA 3360

CAATGAAAGT TAAGAAATTA AAAAAAGAAT ATATGACGCT TGCAAATGAG AAGAAGGATG 3420

CGATATATCA ATTAAAAAAA TTCATAGGTT TATGTAATCA ATCTATCAAG TATAACGAAG 3480
AATTAGCTGA


TAATAAAAGT 


GAAGCAACTA 


ATCTTACGAC 


AAAATTAGAA 


CATAATAATA 


3600 




AAGCGTTAAG 


AGATACTGCG 


AAGAAGAACC 


TAGATGATAG 


TAAAGAAAAT 


GAAGTAAAAG 


3660 




GCGCGATTAA 


AAATCACATT 


ATGCCAATGA 


TTGAAAAGCA 


AATTACCGAT 


ATTAACCAAA 


3720 




CTAATATTAG 


TGATAAGCAT 


GTTAATAATG 


CAAGGAAAAA 


CGCAATAGAA 


ATGTATTACA 


3780 




GTCTGCAGAA 


CTATTATAAT 


ACACGTATTG 


AAACAATAAA 


GGTTAGTGAG 


AAGTTATCAm 


3840 




AAGTCGATGT 


AGATAAGTTG 


CCGAAAAAGG 


GTATAGATAT 


AACTCACGGC 


GATAAAGCCT


3900 




TTGAAAAAAA 


GCTTGAAAAA 


TTAGAAGAAA 


AATAACTATA 


ATCATTTTTC 


AAAGTTAAAA 


3960 




ATTTTGAATT 


TATGGTTAAC 


ATGTCAACTT 


ACTATGTGTA 


TAATGGTAAA 


CATTGATATT 


4020 


AACTATATGT 


ATAAAAATGT 


CACGCAGATG 


CTATTTAAAT 


GTGATAAATA 


tttttagagg 


4080 




TGAATAGAGT 


GGCTATAAAG 


CTAAGTTCAA 


TTGACCAATT 


TGAACAGGTT 


ATTGAGGAAA 


4140 




ATAAATATGT 


TTTTGTATTA 


AAACATAGTG 


AAACTTGTCC 


AATATCGGCA 


AATGCGTACG 


4200 




ATCAATTTAA 


TAAA'i w m"lA 


TATGAACGCG 


ATATGGACGG 


TTATTATTTG 


ATTGTCCAAC 


4260 




AAGAACGCGA 


TTTGTCAGAT 


TATATTGCTA 


AAAAAACGAA 


CGTTAAACAT 


GAATCACCTC 


4320 




AAGCATTTTA 


TTTTGTAAAT 


GGTGAAATGG 


TTTGGAATCG 


AGACCACGGT 


GATATCAATG 


4380 




TGTCGTCATT 


AGCACAAGCA 


GAAGAATAAT 


GAAACTATAG 


GGTTGGAACA 


TTTTGCCTTA 


4440 




CACTACTAGA 


CGTGAATAGC 


ACAACTTAAA 


TTCGTGTGAA 


TCAGAGTAGT 


TTGGCTATAA 


4500


TGATGTTCTG


ACCT'riTATT


TTATGTCACC


TTTAGAAGCA


GTTAAGTTAG


TACTITmA


4560




CAAACATATG 


TATAATATAT 


TCGAGTATTT 


TTATTGAAAa 


tATTTTGGAA 


AACGACGAAT 


4620 




CCAATAAGAA 


AATTTAAACA 


TGATTTGTAA 


GTTAGTTTAA 


TAGGAAATAT 


ATGCTAAACC 


4680 


AAAAGAAGCA 


TATTGTTATT 


TACTGGAATA 


ATTAATAATC 


ATGTCATGTT 


AAATGTTAGC 


4740 




ATATAATCAC 


GAGATAAAAT 


CTAAAATTTA 


AGATTAATCT 


TTTATGAATA 


AAAAACGTAT 


4800 




CACAACAAAT 


AATAAAGTAA 


GGTGGTCAAG 


GTTATGAAAG 


TATTAGTAGC 


CATGGATGAG 


4860 




TTTCATGGAA 


TTATTTCAAG 


TTATCAAGCT 


AATAGATATG 


TTGAAGAGGC 


AGTTGCAAGC 


4920 




CAAATTGAAA 


CTGCAGATGT 


AGTTCAAGTA 


CCATTGTTTA 


ATGGAAGACA 


TGAATTATTA 


4980 




GATTCTGTAT 


TTTTATGGcm 


ATCTGGGcaA 


AAGTATCGTA 


TACCAGTACA 


TGATGCAGAT 


5040 




ATGAATGAAG 


TTGAAGGTGT 


TTACGGACAA 


ACTGATACAG


GGATGACCGT 


TATCGAGGGG 


5100 




AATTTATTTT 


TAAAAGGTAA 


AAAACCAATT 


GTTGAACGAA 


CAAGTTATGG 


TTTAGGAGAA 


5160 




ATGATTAAAC 


ATGCATTAGA 


TAACGACGCA 


AAACATGTTG


TAATTTCACT 


AGGTGGGATT 


5220 




gatagttttg 


ATGCTGGTGC 


AGGTATGTTA 


> CAAGCATTAG 


GTGCTCAATT 


CTATGATGAC 


5280 
GATATGTCGA ACTTACACCC TAAAATGGAA ACAGCAAGAA TTCAAGTAAT GTCGGATTTT 5400

TCAAGTCGAT TATATGGTAA GCAAAGTGAA ATCATGCAAA CTTATGATGC GCATCAGTTG 5460

AATCATAATC AAGCAGCAGA AATCGATAAT TTAATTTGGT ATTTTAGTGA GTTATTTAAA 5520

AGTGAATTGA AAATTGCAAT TGGTCCAGTT GAACGTGGTG GTGCTGGTGG TGGAATTGCA 5580

GCAGTCTTGA ATGGACTGTA TCAAGCTGAA ATATTAACCA GTCATGCATT AGTAGACCAA 5640

CTAACACATT TAGAAAATTT AGTTGAACAA GCGGATTTAA TTATTTTTGG AGAAGGATTA 5700 

AATGAAAATG ATCAGTTGCT AGAAACGACA ACATTGCGTA TTGCAGAACT TTGTCATAAA 5760 

CATCAAAAGG TTGCCATTGC AATTTGTGCA ACTGCTGAAA AGTTTGATTT ATTTGAATCA 5820 

CAAGGGGTTA CAGCAATGTT TAATACATTT ATCGATATGC CAGAAACTTA TACTGACTTT 5880 

AAAATGGGtT ACAAATTAGG CATTATACGG TTCAGTCTTT AAAACTGTTG AAAACACATT 5940 

TTAATGTTGA GGTTTAGTAA AGAAGGACTA AATTGGTGAT GCTGTCATGA TGGTTAATAA 6000

CATTTATGAT GGTTAGCAAA ACGAATTAGA AGATGGAAAG TATACGTAAA AAATATGAAA 6060 

AATCACGCTA TCATTGCACT GAATGTTAGC GTGATTTTTA TATATTAATT AAGCCTGAGT 6120

TGAACTAGTA TATAATCGTT GGTTTTTAGT GATTTTCAGC GATATCTTCT ACAATTCCAA 6180

TGATTACTTG TACTGCTTTT TCCaTAACAT CAATGGATGC aTATTCATAT GGGCCGTGGA 6240

AGTTACCGCA ACCTGTAAAG ATGTTTGGAG TTGGTAACCC CATAAATGAC AATTGTGAAC 6300 

CATCTGTACC ACCGCGAATA GGTTCAGTGT TTGCTGGAAT ATCTAATTTG GCAAAGACAC 6360

GTTTAGGTAT ATCAATAATA TGAGGCAATG GTAATATTTT TTCTGCCATA TTGAAATATT 6420 

GATCCGATAT ATCAACTTTA ACTGGATAAT TTTCAAAATG GGCATTGATA TCGTCACGTA 6480

TTTCTAAAAT ACGTTTCTTA CGCAATTCGA ATTGTTTTTT ATCATGATCA CGAATAATGT 6540

ATTGCAAAGT TGCTTTTTCA ACAGTTCCTT CAAAGTTCAT TAAGTGATAA AAGCCTTCGT 6600 

ATCCTTCTGT TCGCTCCGGA ACTTCACTAT CAGGTAGCAA ACTATCGAAT TGTTCACCTA 6660 

AACGTATTGC GTTTACCATT GCATTTTTAG CTGAACCAGG ATGAACATTT ACACCGTGGC 6720 

ATGTAATAAC CGCTTCAGCA GCGTTAAAGC TTTCATATTG TAATTCTCCA TATTGACTAC 6780 

CATCCATAGT ATAAGCAAAA TCAGCATTGA AGCGGTCAAC ATCAAATTTA TGTGGACCAC 6840

GACCGATTTC TTCGTCTGGT GTAAATCCAA TGCGAATGGT ACCATGTTTA ATTTCTGGAT 6900 

GTTCTTGTAA ATAACAAATA GCTTCCATAA TTTCCACAAT ACCCGCTTTA TCGTCTGCAC 6960 

CTAGTAACGA TGTACCATCA GTTACCATTA ATGTATGACC AACTAAACTG TTAAGTTCTG 7020

GAAATACTTT AGGATCTAAG ACACGTTTAG TATTGCCTAG TTTGTATGGC TTACCATCAT 7080 

GCGCCAAAAA TCCAACTGTT GGGACGTCGA CATCGATGTT ACTTTCTAAT GTAGCAAATA 7200
AGTAGCCATT TTCATCTAAA TCAGTTGGCA ATCCTAATTG TTGTAATTCT TTTTCTAATA 7260
AATGTAACAA ATCCCATTGC TTTTCAGTTG AAGGTGTTGT TGTAGATTTT GGATCAGATT 7320
GCGTATCAAT TGTCGTATAT CTTGTTAATC TATCTATCAA TTGGTTCTTC ATTATATTCG 7380
ACCCCTTAAA CTCTATTATT CATGTTGTAA GATTTTTTAT ATGTCTTACC TTTGATTTTA 7440
CCATACAGTT GTTTGATACG TGTGTATAGG TAATATXGAA TTTCAGAAAC TAATATACCG 7500
AAAGCAATCG CACCTGAAAT CAGTGTAcTT CTAAAAATGT ATTTACAGCA CTTGTATAAT 7560
CATTTGATAC TAAAAAACGA GTCGCTTGAT AAGCTGCACC ACCAGGTACT AATGGTATAA 7620
TGCCTGGCAC TATGAATATA ATTACCGGTC GTTTATATCT GCGACTCATA GTATGACTCA 7680
TTAAGCCTAA AATTAAGCTT CCCAAAAATG AAGCGCCAAC TTTTCCAAAC TCTAAATCTA 7740
CCGTTAATTG GTAAATCGTC CATGCAATGG CACCCACAAA TCCACATGCT ACTAAGAGGC 7800
GTTTGGGTGC ATTGAAAATG ATAGAGAAAA GTACTGTTGA TATAAAGCTG ATTGTAAAAT 7860
GAAATAAATA AAATAGCATG CTTTAACAGT CCTTCCTTAA ATGATTAATA AAACGATTGC 7920
GACACCAGCA CCGATTGCGA ATGCTGTTAA TGCAGCTTCA ACACCGCGAG ACATACCTGC 7980
AAGTAATTCA CCCGCTAATA AATCTCGAAT GGCATTGGTA ATTAATATAC CAGGGACAAG 8040
TGGCATGACA CTGGCTATAG TAATGATATC TTGATTGGTT GCAATGCCTA ATTTAGTAAA 8100
TGTGGCTGCA ATGGATATGA CCACAGCGGC TGCAACAAAC TCTGAGAAAA ATTTAATTTG 8160
TATATAGCGT tGCACAAAGC TGAATGTTAA AAATGCGGAT CCGCCAGCAA TGACTGCAAT 8220
CCAACAATCT GATGCGACAC CACCAAACAT AAATAGGAAG AAGCCACATG CAATGGCAGC 8280
TGCAAAGAAA TTCGTTAAAA AAGAATATTG TAATGATGCA TGCTGTAAAT GAATAAATTC 8340
AGATTTAGCT TCATCAATTG TGAGTTCTTT ATTTGATATT TTACGTGAAA GACTATTCGT 8400
TAAAGCGATT TTCTCTAAAT CTGTTGTACG CTCTTGTACA CGAATTAATC TTGTACTTGT 8460
TCGATCGTTT AATGAAAAAA TAATTGCAGT TGAACTGACA AAACTATATG TATTATGAAG 8520
ACCATAACTA TGTGCGATAC GGTTCATTGT ATCTTCAACT CGATATGTTT CAGCACCTGA 8580
TTCaAGTAAA ATTCTACCTG CAATTAATAC AACATCAATC ACTTTGTTTT CATCTATAAT 8640
TGTGATTGAA TCTGGCATAT CAATTCACCT CCAATGATAT GTGTTATTTA TTTGAACAAT 8700
TGaAGTTTAC AACTTGTTGT TACAACTTTC AATAGTGAGA CTTTGTGTTA GTATGATGAA 8760
CTTGTATGGT TCAAATTTAA ATAAGAAAAA CTGTTAATCT TTGCTATTAT ACTATGATTT 8820
AATAATAGCA AAGGATTAAC AGTTTTGTCG TTGTTATAAA TTGATAATAG GGTTAAACAT 8880
TTTACGCTGT GATTTTGGAT CGTCATCTGT TAAATAACCA ACACCGATAG ACACTGACAA
TTTAATAACT TCTTTGTTTG GTAAATGGAA TGATGATTTT TCAACACCCG AACGAATATT 
TTCAGCTAAT TTAACACTTT GATCAAGTGA ATAATTCTGA ATGACAACTG AGAACTCTTC 
GCCACCATTT CTAAAAATTT TAAATTGATT CGGCACATAG TTTTTAAGTA ATTGAGACAT 
TTGTTTTAAT ACAGCATCAC CTGATTTGTG TGAGTAGGTA TCATTGaCAT CTTTAAATCC
ATCGATATCG ATTAATAATA ATGCGATACT TTGATGTTCT TTTTCAGCTT TTCGTGAAAT 
TTCATTTAAA TGTCTATCAA ATTCTTTTAC ATTACCTAAG CCTGTTAAGT AATCATATTT 
ATCTTCGTTT TCATAACGAT TTACGAGTGA GAAGAAATGC CAAATATCGA CAAATGTTAT

CGCTGAAGCT AAAGTGATAA TTAATGAAAT TGGTATTAAA ATGATAACTT CCGATAGTGT 
GTAAATAGGA CTCACTAACG CGACACCAAA TAAAATGATT ATTGTAACAA CATTAAGTAT 
TAATAATGAT AGCACATCAT TTTGTTTTAA AAATGGTCCA ATAGCACTTG TTACTGCAGC

AATAACAATC AACGTAACAC CGTACATAAT CGAGTTGTTA AATAGTACAA TTTCAACAAT 
TGCTACAATT ACTGTGGCAG ATAATGTATA GACCATATTT GTAAATCTAC CTAAAAACAA
TAAAGGAACG AATGTTAAGT GAATTAAATA ATCTTCACGA TAAGGGATAG GGTAGACAGA
TAATAATAAT GATACGATTG TCATTAAAAC AGTGACATAA GCCTTAGAAA AAAC 
(2) INFORMATION FOR SEQ ID NO: 38: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 23439 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

TCTCAATCAG ATGAAAAATT GCATATCGTA GGTTTTACAG AAAGTGCAAA ATATAATGCG 60 

TCATCAGTCA TTTTCACGAA TGACGCTACC ATTGCCAAGA TCAATCCTAG ATTGACTGGA 120 

GATAAAATTA ATGCAGTTGT TGTACGTGAT ACAAATTGGA AAGACAAAAA ATTAAACCAA 180 

GAGCTTGAAG CGGTAAGTAT TAATGACTTT ATTGAAAATT TACCAGGTTA TAAACCACAG 240

AACTTAACAT TAAACTTTAT GATTTCATTC TTATTTGTCA TTTCAGCTAC AGTTATAGGC 300 

ATTTTCCTAT ATGTCATGAC ATTACAAAAG ACGAGTTTAT TTGGCATATT AAAAGCTCAA 360

GGATTTACGA ATGGCTATTT GGCGAATGTG GTAATTTCGC AGACGGTCAT ATTAGCACTA 420 

1TTGGTACGG CATTTGGCTT ACTGTTAACA GGCGTTACAG GTGCATTTTT ACCTGATGCA 480 

TCTGTATTAG GAAGTTTATT CTCCATTTTA ACAATTAGAA AAATAGATCC GTTAAAGGCG 600
ATTGGGTAGG AGGTGTAGCA AATGTTGAAA TTTGAAAATG TAACAAAGTC ATTTAAAGAT 660
GGGAATCGTA ACATTGAAGC GGTTAAAGAT ACAAATTTTG AGATAAATAA AGGTGATATT 720
ATAGCATTGG TTGGACCTTC TGGCTCTGGT AAAAGTACAT TTCTAACTAT GGCAGGTGCT 780
TTACAAACAC CGACATCTGG GCACATTTTA ATCAATAACC AAGATATTAC GACAATGAAG 840
CAAAAAGCAT TGGCAAAAGT TAGAATGTCT GAAATAGGTT TTATTTTACA AGCTACAAAC 900
CTTGTACCAT TTTTAACGGT AAAGCAACAA TTTACATTAT TGAAAAAGAA AAATAAGAAT 960
GTTATGTCTA ATGAAGACTA TCAGCAACTT ATGTCACAAT TAGGTCTAAC TTCATTGCTT 1020
AATAAGTTAC CTTCAGAAAT TTCAGGTGGT CAGAAACAAC GTGTGGCGAT AgCaAAGCGT 1080
TATATACGAA TCCGTCGATT ATTTTAGCGG ATGAACCTAC CGCGGCGTTA GATACTGAAA 1140
ATGCGATTGA AGTCATTAAA ATTCTACGTG ATCAAGCCAA ACAAAGAAAG AAAGCATGTA 1200
TTATTGTTAC ACATGATGAA CGACTTAAAG CATATTGTGA TCGTTCATAT CATATGAAAG 1260
ATGGCGTCCT TAATCTTGAA AATGAAACAG TAGAATAGTT TTATTAAGCC GGTACATCAT 1320
GTGCCGGTAT TTTTATGTTT ATGTATTATT TGAATAAACT TTCACATTGA ATTAATAATA 1380
ATTATTATCG AAAATCAGAA ATATTCCGTG AAATATAATA TTTTTTGTAG TAAAATGGCC 1440
TCTAAGTATT CAATATTTAA ATATGGGGAT TGAATATAAA ATTATCGTAA TGGGGGTCAA 1500
TGGTTATGGA TTTATTGATA GGTACTTTAT TTTTATTTTT GGTCTTAGTG ATTTTTACAT 1560
TATTTACATA TAAAGCGCCT AATGGTATGC GTGCCATGGG AGCATTAGCT AATGCAGCAA 1620
TCGCAACATT TTTAGTGGAA GCATTTAATA AATATGTTGG TGGCGAAGTA TTCGGTATTA 1680
AATTTTTAGA AGAGCTAGGA GACGCTGCGG GAGGTCTAGG TGGTGTCGCT GCCGCTGGAT 1740
TAAC&GCATT AGCTATCGGT GTGTCACCAG TATATGCATT AGTTATAGCA GCCGCGTGCG 1800
GTGGTATGGA TTTATTACCA GGTTTCTTTG CGGGTTATAT GATTGGATAT GTGATGAAAT 1860
ATACAGAGAA ATATGTGCCG GATGGTGTCG ACTTAATTGG ATCGATTGTC ATCTTAGCGC 1920
CATTAGCTCG TCTTATTGCA GTATTATTAA CGCCAGTAGT GAATAGTACA TTGATTCGAA 1980
TTGGTGATAT TATCCAAAGT AGTACGAATA CGAATCCAAT TATCATGGGT ATCATTTTAG 2040
GTGGTATTAT TACGGTTGTC GGCACAGCGC CATTGAGTTC AATGGCATTG ACAGCATTAT 2100
TAGGTTTAAC GGGTGTACCT ATGGCTATTG GTGCCATGGC AGCATTTAGT TCGGCATTTA 2160
TGAATGGGAC GCTATTCCAT CGCTTAAAAT TAGGTGATCG TAAGTCTACG ATTGCAGTAA 2220
GTATTGAACC TTTATCACAA GCAGATATTG TATCAGCCAA TCCAATTCCA ATCTATATTA 2280
ATGCGACAGG


TACAGCTACA 


CCGATTGCAG 


GATTTTTAGT 


TATGTTTGGA 


TTTAATCATC 


2400 


CGACGACAAT 


TGTGATTTAT 


GGTGTAGTAA 


TGGCGATTGT 


AGGTGCGCTT 


GCAGGTTATC 


2460 


TTGGTTCAAT 


TGTATTTAAA 


AAATATCCAA 


TTGTTACTAA 


GCAAGACATG 


ATTAATCGAG 


2520 


GTGCAGTAGA 


CGCATAGCAT 


CATCATATTG 


AATAGTAAAA 


ACAAATAAAA 


CATAGTAACG 


2580 


TGATTCAGTC 


GATGTAACAG 


TCGATAATGA 


GTCACGTTTT 


TTTATAGAAA 


AATACAAGAC 


2640 


ATAAAAATGT 


CATAATTTAT 


TGTCGACAAA 


TATCATACTG 


TATAAACATT 


TATCATTTTC 


2700 


TCAAGTACCT 


TTTACACGAT 


GGAATGAACT 


TACTTTTTAC 


GAAATTATGC 


GTATTTTATA 


2760 


AACAAATATC 


ATTGATATAA 


CGGTAAATGT 


AAGCGTTTAC 


AACAGAAATA 


ACAGCATGCT 


2820 


ACGATATTTT 


TGTAAATTCA 


CTGATTCAAG 


TATTTTAAGT 


CAATATGAGG 


AGGGATGTTA 


2880 


TGAGCGATTC 


TGAGAAAGAA 


ATTTTAAAAA 


GAATTAAAGA 


TAATCCGTTT 


ATTTCACAAC 


2940 


GTGAACTTGC 


TGAGGCAATT 


GGATTATCTA 


GACCCAGCGT 


AGCAAACATT 


ATTTCAGGAT 


3000 


TAATACAAAA 


GGAATATGTT 


ATGGGAAAGG 


CATATGTTTT 


AAATGAAGAT 


TATCCTATTG 


3060 


TTTGTATTGG 


CGCAGCGAAT 


GTAGATCGTA 


AGTTTTATGT 


GCATAAAAAT 


TTAGTTGCAG 


3120 


AAACATCAAA 


TCCTGTAACG 


TCAACACGCT 


CTATTGGTGG 


CGTAgCAAGA


AATATTGCTG 


3180 


AGAACTTAGG


TAGGCTTGGC 


GAAACGGTCG 


citttttatc 


TGCTAGTGGA 


CAAGATAGTG 


3240 


AATGGGAAAT 


GATTAAACGA 


TTGTCCACAC 


CATTTATGAA 


TTTGGATCAT 


GTTCAACAAT 


3300 


TTGAAAATGC 


GAGTACAGGT 


TCATATACAG 


CTTTAATTAG 


TAAAGAAGGC 


GACATGACAT 


3360 


ATGGCTTaGC 


AGATATGGAA 


GTGTTTGACT 


ACATTACGCC 


TGAATTTTTA 


ATTAAGCGTT 


3420 


CACACTTATT 


GAAAAAGGCT 


AAGTGCATTA 


TTGTAGATTT 


GAATTTAGGC 


AAAGAGGCAT 


3480 


TAAACTTCTT 


ATGTGCCTAT


ACCACGAAAC 


ATCAAATCAA 


ATTAGTTATC 


ACCACGGTTT 


3540 


CTTCCCCAAA 


AATGAAAAAT 


ATGCCTGATT 


CATTACATGC 


TATTGATTGG 


ATTATCACGA 


3600 


ATAAAGATGA 


AACAGAAACA 


TACTTAAATT 


TAAAAATAGA 


ATCTACTGAT 


GATTTAAAAA 


3660 


TAGCTGCTAA 


ACGCTGGAAT 


GATTTAGGTG 


TTAAAAATGT 


TATTGTGACA 


AATGGCGTGA 


3720 


AAGAACTCAT 


TTATCGAAGT 


GGTGAGGAAG 


AAATCATTAA 


GTCAGTTATG 


CCATCAAATA 


3780 


GTGTGAAAGA 


TGTTACAGGT 


GCAGGCGATT 


CATTCTGTGC 


TGCAGTAGTG 


TATAGCTGGT 


3840 


TAAATGGGAT 


GTCTACTGAA 


GATATATTAA 


TTGCTGGTAT 


GGTTAACGCA 


AAGAAAACGA 


3900 


TAGAAACGAA 


ATATACAGTT 


AGGCAAAACC 


TAGATCAACA 


GCAACTTTAT 


CACGATATGG 


3960 


AGGATTATAA 


AAATGGCAAA 


TTTACAAAAG 


TATATTGAGT 


ATTCTCGAGA 


AGTTCAGCAA 


4020 


GCACGGGAGA 


ACAATCAACC 


GATTGTAGCA 


TTAGAATCAA 


CAATTATTTC 


GCATGGTATG 


4080 
GCCATTCCAG


CAACCATAGC 


CATTATAGAT 


GGCAAAATTA 


AAATTGGTTT 


AGAAAGCGAA 


4200 




GATTTAGAAA 


TACTGGCAAC 


TAGTAAAGAC 


GTTGCTAAAG 


TATCTAGAAG 


GGATTTAGCA 


4260 




GAAGTTATTG 


CGATGAAGTG 


TGTTGGTGCT 


ACTACTGTAG 


CGACGACGAT 


GATATGTGCT 


4320 




GCAATGGCTG 


GTATTGAATT 


TTTTGTTACA 


GGAGGTATTG 


GGGGCGTCCA 


TAAAGGTGCA 


4380 




GAACATACGA 


TGGACATTTC 


AGCAGACTTA 


GAAGAACTGT 


CTAAAACAAA 


TGTCACTGTT 


4440 




ATCTGTGCAG 


GTGCCAAATC 


AATTTTAGAC 


TTACCTAAGA 


CGATGGAGTA 


TTTAGAAACA 


4500 




AAAGGCGTTC 


CAGTTATTGG 


ATATCAAACG 


AATGAATTGC 


CAGCATTCTT 


CACTCGCGAA 


4560 




AGCGGTGTTA AGTTAACAAG 


TTCGGTTGAA 


ACGCCAGAAC 


GACTTGCTGA 


CATTCATTTA 


4620 


ACAAAACAGC 


AGTTAAATCT 


TGAAGGTGGC 


ATTGTTGTTG 


CTAATCCAAT


TCCATATGAG 


4680




CATGCCTTAT 


CAAAAGCATA 


TATTGAGGCA 


ATCATAAATG 


AAGCTGTTGT


TGAAGCGGAA 


4740 




AATCAAGGTA 


TTAAAGGTAA 


GGACGCCACA 


CCGTTCTTGT 


TAGGGAAAAT 


TGTAGAAAAA 


4800




ACGAATGGTA 


AAAGTTTAGC 


AGCAAATATA 


AAACTTGTTG 


AAAACAATGC 


GGCGTTGGGT 


4860 




GCTAAAATTG 


CTGTCGCTGT 


TAATAAATTA 


TTGTAGGTGA 


TGATACATGA 


ATATTTTATT 


4920 




CGCTATCACA 


GGGATAGCAT 


TTGCACTATT


TGTTGCGTTT 


TTATTCAGTT


TTGATCGTAA 


4980 




AAAAATAGAC 


TTCAAAAAGA 


CGTTAATAAT 


GATATTTATT 


CAAGTGTTGA 


TCGTGTTATT 


5040 




TATGATGAAC 


ACAACGATTG 


GTTTGACAAT 


TTTAACTGCA 


CTAGGTTCAT 


TTTTTGAAGG 


5100 




GCTAATAAAT 


ATTAGTAAAG 


CAGGCATAAA


TTTTGTTTTT 


GGAGATATAC 


AAAATAAAAA 


5160 




TGGCTTTACG 


TTCTTTTTAA 


ACGTATTACT 


GCCATTAGTT 


TTTATTTCTG 


TATTAATAGG 


5220 




CATCTTTAAT 


TATATTAAGG 


TATTACCATT 


TATTATCAAA 


TATGTAGGTA 


TCGCTATTAA 


5280 




TAAAATAACT 


AGAATGGGGC 


GCTTAGAAAG 


TTATTTTGCT 


ATTTCAACAG 


CAATGTTTGG 


5340 




GCAACCAGAA 


GTATATTTAA 


CAATAAAAGA 


TATTATTCCA 


AGATTATCTA 


GAGCGAAATT


5400 




ATATACAAXT 


GCGACGTCTG 


GTATGAGTGC 


TGTTAGTATG 


GCAATGCTAG 


GTTCATATAT 


5460 


GCAGATGATT 


GAACCCAAGT 


TCGTAGTTAC 


AGCAGTAATG


TTAAATATTT 


TTAGTGCGCT 


5520 




TATCATCGCC 


AGTGTAATCA 


ATCCCTATAA 


ATCTGATGAT 


ACTGATGTTG 


AAATTGATAA 


5580 




CTTAACGAAA 


TCCACAGAAA 


CTAAAACATT 


GAATGGAAAA 


ACAGGAAAAC 


CTAAGAAAGT 


5640 




TGCCT'ITTTC 


CAAATGATTG 


GTGATAGTGC 


GATGGATGGG 


TTTAAAATCG 


CTGTTGTAGT 


5700 




AGCCGTAATG 


TTGTTAGCAT 


TTATTTCATT 


AATGGAAGCA 


ATTAATATCA 


TGTTTGGTAG 


5760 




TGTTGGTTTG 


AACTTTAAAC 


AGCTTATTGG 


CTATGTGTTT 


GCACCAATCG 


CATTCTTAAT 


5820 




GGGGATTCCA 


TGGAGCGAAC 


TGTTCCAGCT 


GGCTCTTTAA 


TGGCGACTAA 


ATTAATTACA 


5880 
CAAGGTATCA


TTTCAGTTTA 


CTTAGTAAGC 


TTCGCTAATT 


TTGGTACGGT 


TGGTATCATC 


6000 




GTAGGTTCAA 


TTAAAGGCAT 


TAGTGATAAA 


CAAGGAGAAA 


AAGTTGCATC 


CTTTGCAATG 


6060 




AGGTTGCTAC 


TTGGTTCAAC 


TCTAGCTTCA 


ATCATTTCAG 


GATCAATCAT 


TGGCTTAGTA


6120 




TTGTAAATGA 


ATCGAAGTAC 


CTAAATTAAA 


TTCATGGCAA 


AGCTAAACCC 


CGTCACCAAG 


6180 




TTGGCGCAAC AGCGcATgcA TAACTTAGTG ACGGGGTTTT ATCATAACAA 


TCTACTTTTT 


6240 


CGTAGCCGTT 


TTTGAAATGT 


ATGTTGATGG 

^» X w X X \J*± X VJVJ 


l l InxV- 1 ill 


1 L/WWil x\j 


TTAATCCCGT 


6300 




TATATCTTTT 


TTATGTTTTG 


AAGGGACAAT 


unrtuL 1 AAVj 1 


7AT7AT7A ?V OfTA a 
AX Al AnvjUiA 


AGACAAAAGC 


6360 


IS 


AACTGTAAAT 


GAAATGGTAG 


ATAPATAGAJV 


A\j\j Ivj/io X in 




CACCATTATA 


6420 




GACATAAGCA 


AAGATGATAt* 


CPA AT ATT A tt 




71 /""*TV pfV^ H Jfc TV"^ 

A UAL. (JU AATG 


TATT CGTACG 


6480 




TTTAGTGAAA 


ATACCAACTY? 




A vj V— Aft 1 vtVxA 




ATCCAGTCAC 


6540 


20 


AAACAAGAAT 


AAATCCCATA 


AG T ^TTTflA 


ATT a n 2V 2V ^ra 


Al 1 AAVj 1 A 1 A 


GTGACATTCC 


6600 




AAAACCGAAA 




x VJ/n X x An X 


ti A & nrnTr: f» a 
vjaAAL^ 1 VJl»A 


AAL» 1 1AAL x I 


CGTGTCGCTC 


6660 




GCTACCTTTT 


CCGAAGAAGC 


w x xwlx inni 


Vj X \.U/l X X \*nt\ 


A x nLAAuLnVj 


ATATAGAATT. 


6720 


2S 


TAAACTAGAT 


GAAATGGTAG 


ACTGTGCAG C 


GGCjGAVAATG 


\JV_ X uLnnl nA 


GTAATCCTGC 


6780 




TACAAATGGT 


GGCATCTCAG 


TCAAAATGAA 


n x n x uOLnU X 


ArAflATHATfi 
nUnun 1 VjiA X\» 


TATTGAAGCC 


6840 


30 


TTTTGGTAAA 


ACAGCTTCAT 


GTGTATAAAA 


TGAATAC*AGP 


n X iulnv.v<Ln 


TACCATAAAA 


6900 


TAAGGGTGCT 


GAAATTAAAG 


CTAGGATACC 


ATTTGTCCAT 

*» XXX VJ X w V— X 


AArGATTTAT 

/WwVJA X X X f\ X 


TTGTTTCTTT 


6960 




TAAACTATCA 


GAAGCTTGAT 


AACGCTGCAC 


GACGTCTTGA 

v*f»wv x x x un 


CT£Y2£'*TCiTf5T 

X VvW>> X w X w X 


ATTGATACAA 


7020 


GTTGTTGAAA ATATTTCCTA GGAAAATAAT TGGAATGGCA GCTGCCGCAG TATTTAGTTT 7080
CCAATTGTCT GCACTAATTA ATTTTTTGTG CTCAATCGCA TCTGCAAAGA CAGTGCCGAA 7140
accgCcttta ATGTTCACAA CACCTAGAAT AATAATAACT AAAGCGCCGC CTAATAAAAT 7200
GACGCCTTGA ATGAAATCAC TCCAAACCAC ACCTTCGAAA CCACCTAAAA ATGTATATAA 7260
AATACATAGT AAACCAACGA GTGATGCAAC GATATAAGGG TTCATGTCTG ATACAGATGT 7320
GATTGCTAAT GTTGGTAAGT AGATAACAAT TGCAACACGC CCTAAATGGT AAACGACAAA 7380
TAATAATGAG CCAATGACAC GTATGCTAGG GCCAAATCTA GCTTCTAAAT ATTCATATGC 7440
AGATGTTACC TTTAACTTTT TAAAGAAAGG GACATAGAAA TAAATAAGTA ATGGAATAAT 7500
TGCGACGATA GCAATGTTAC CAGCGATATA TGACCAATCT GTTAAAAATG CTTTCTCTGG 7560
TGTCGACATA AATGTAATCG CACTTAACGT AGTAGCATAA ATTGAAAAGC CAACTACCCA 7620
AGATGGCAAG CGACCACTTG CGGTAAAGAA ACTATTGGTA CTTTGGCTCG CGCGCTTGGT 7680
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10 






TGTGCCAAAT 
TGTCTATAAA 
AGGTTTGAAA 
CAATGTTGGA 
CAGTTGGTAA 
GCGACCATTA 
AACTAACATT 
TTCGAGTAGG 
TAGCTCGCTG 
AATCATATTG 
AGTAAATGGA 



CCAACTTCTT TCATGGGCAA CATCCCCTTT ACAATGTATT GATTCTTTGA 
TCGTATTTTG CAATGAGTTG ATCTAATGTT TGTCGATGTG CTTCGTTAAA 
GGTCTTTTCG GTAATCCTGC ATCAATGCCA CGATGACGTA ATATTTCTTT 
TAAATCCCCA TTGATAACAC TGTTTCGATA ATGTCGTTTG AATCATGTTG 
GCTTCTTGAA TTTGACCTTG TCGTGCTAAG TCGAAGATTT TTCTTGCACG 
ACGTTATATG T AG AAC CAAT TGCACCATCT ACGCCAGAAA TCGTAGCTTG 
TCATCAAAGC CAGATAAGAT TAATTTGTCT GGGAATGCTT TTCTAATACG 
AAGAAGTTTG GCGCTGTATA TTTAACACCA ACAATTTTTT CATGATTAAA 
AATTGTTCAA TAGAAATATT CACACCTGTT AAATCTGGTA TTGCATAAAT 
TTCTGAGTTG CTTCGATAAT ATCGAAATAG TAATCTCTAA TTTCTTCAAA 

AGAAAGTGCA TCATAACCGA GTT CTGTGGC 



SO 



ATATTTTCCA AGTTCAATGG CTTCATTTAA ATCTAACGAA CCTACTTGAG CAATCAATTT 
CACTTTATCC CCAACTGCCT CTTTGGCAAC CTTGAAAACT TGCTTCTTCT GCTCTGTATT 
TAATAAAAAG TTTTCGCCTG AGCTACCATT TACATAAAGA CCGTCTAATT CTTCAGTTTC 
AATGGCATTT TGAGCAATTT GTTTAAGTCC TTGTTCATTT ACTTGACCAT TTTCATCAAA 
AGGAACGAGT AACGCTGCAT ATAAACCTTT TAAATCTTTG TTCATTATGA AGTCCCTCCA 
AAAATCATTT GATAATATAG TTTACAGCTA TAATTGTAAA CGCTATCATA AAATGTAACA 
ATATCTTTTT GAAAATTGTA GTCATATTTA TGTATAATTA ATGAAAATGT TTTTCAAAAT 
CAATAGAAAT GGAGTGAGTA AGGTGTATTA CATCGCAATC GATATTGGAG GCACTCAAAT 
TAAATCGGCA GTTATTGATA AGCAATTGAA TATGTTTGAC TATCAACAAA TATCAACGCC 
GGACAACAAA AGTGAG CTTA TTACTGACAA AGTATATGAG ATTGTAACAG GATATATGAA 
GCAATATCAG TTGATCCAAC CTGTCATAGG TATTTCATCA GCAGGCGTTG TTGATGAACA 
AAAAGGCGAA ATTGTATACG CAGGGCCAAC CATTCCGAAT TATAAAGGTA CTAATTTTAA 
GCGATTATTA AAATCACTGT CTCCTTATGT CAAAGTAAAA AATGATGTAA ACGCTGCATT 
ACTAGGCGAA TTGAAATTAC ATCAATATCA AGCAGAACGG ATCTTTTGTA TGACGCTTGG 
TACAGGCATT GGGGGTGCGT ACAAGAATAA TCAAGGTCAT ATTGATAATG GTGAGCTTCA 
TAAGGCAAAT GAAGTTGGGT ATTTATTGTA TCGTCCAACT GAAAATACAA CGTTTGAGCA 
ACGTGCTGCA ACGAGTGCAT TGAAAAAGCG CATGATTGCC GGAGGATTTA CGAGAAGCAC 
ACATGTGCCA GTATTGTTTG AAGCAGCTGA AGAAGGTGAT GATATTGCAA AACAAATATT 



7800 

7860 

7920 

7980 

8040 

8100 

8160 

8220 

8280 

8340 

8400 

8460 

8520 

8580

8640 

8700 

8760 

8820 

8880 

8940 

9000 

9060 

9120 

9180 

9240 

9300 

9360 

9420 

9480 
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AGGGCTTATA 


TTAATTGGGG. GCGGTATATC 


TGAACAAGGA 


GATAATCTCA 


TTAAATATAT 


9600 




CGAGCCGAAA 




ATTTACCAAA 


AGACTATGTT 


TATGCACCAA 


TACAAACGAC 


9660 


5 


TAAGAGTAAA 




CATTATATGG 


CTGTTTGCAA 


TGATAGTTGA 


AAGAAGGAGT 


9720 




CATTCTAAAA 


T* h/"? TV ft 'ITT^" A 

1 AuAAlii G A 


AACCGl iACG 


AGAGATGAGA 


GCTGTTGTTA 


GTTCCACACA 


9780 


10 


TCACACTCTA 


m/'»m » IV f*f* * 

T CTAGGACCA 


ATCTAAACTA 


TATCAACCAA 


CAGTGTGCCA 


CGGGCAAATT 


9840 




AAATTGAAGA 


AGCTGAGATA 


TTAAAATTTT 


AGAAAATGTA 


AAAAAATATT 


TGGTATTGAA 


9900 




ATTAAAAAAG 


CACCTAG CAA 


CTCGTTGGGA 


CAATCACGAT 


GATTGTCTAC 


AGTTGCAGGT 


9960 


15 


GGATTTGAAT 


ATACTACTAG 


TTATTTGTTG 


TCTAGGATAA 


TAGATTTAGT 


ATGTTGATAA 


10020 




GTTTGACTCA 


GATTCGTATT 


TTCTAATAAA 


TGATAACTCA 


CGATATCGAT 


TAAAAAGAGT 


10080 




GTCGCAATTT 


GTGTGTTGAT 


AAATTGATGG 


TCGGTATTAC 


GCGATTGATC 


CGTTGTTAAA 


10140 


20 


AGTACTAAAT 


CTGCACAATC 


TGTAAGTTTA 


CTACCTTCAA 


AATTTGTGAT 


GGCAACGACA 


10200 




TATGCACCAT 


GAGATTTGGC 


GACTTC CGCT 


GCAGAAATTA 


ATTCCGAAGT 


ATTACCACTA 


10260 




TTTGACATAG 


CAATAAACAT 


ATCCGAATGA 


GATAGTAGGG 


ATGCCGATAT 


TTTCATTAAA 


10320 


25 


TGTGAATCGG 


TAGTAACATT 


ACCTTTTAGC 


CCCATACGAA 


TCATACGOTA 


ATAAAATTCA 


10380 




GTCGCTGATA 


AACCAGAGCT 


ACCTAGTCCA 


GCAAAGAGTA 


TATGTCGACT 


TGATTGAAGT 


10440 




TTGTCGATAA 


AGGTTTGGAT 


AATGTCGTTA 


TCAATAAATT 


CACCAGTTTG 


TTGAATGATT 


10500 


30 


TGTTGATGAT 


ATTTATGAAT 


TCTTTGAATA 


ATTGGGCTAT 


TTTCAATAAC 


TGTCTCTGTC 


10560 




ATTTCTTGTT 


GAATATTAAA 


TTTTAAATCT 


TGGAAATTCT 


CATAATCCAG 


CTTATGACTA 


10620 


35 


AAGCGTGTCA 


TCGTTGCTGG 


TGATGTACCA 


ATCGCATGGG 


CTAAGGAGTT 


AATCGTTGAA 


10680 


AAGGCATCGC 


T AT AAC CATT 


TTGTCTTATA 


TAATTGACGA 


TG CGTTTATC 


aottittuta 


10740 




AATAAATGTT 


GATAACGTTG 


AACACGATTC 


TCAAATTTCA 


TTGTGTCACC 


CCTTCATCTT 


10800 


40 


AATGATTACT 


• hi* mn »TV» 

ATTATATATG 


AAAAATATTT 


TCAAGATAGT 


AAAAAGCATT 


GATAAAAATT 


10860 




ATCTTAATGA 


TATATTGTAA 


ATGACTTTAC 


GTGAAAAAAC 


GACTTATGGA 


GTGAGGAATA 


10920 




ATGTTACCAC 


ATGGATTAAT 


AGTATCTTGT 


CAGGCACTAC 


CAGATGAACC 


ATTGCATTCA 


10980 


45 


TCTriTATTA 


TGTCGAAAAT 


GGCATTAGCT 


GCGTATGAAG 


GTGGTGCTGT 


TGGTATTCGC 


11040 




GCAAATACTA 


AGGAAGACAT 


TTTAGCAATT 


AAAGAAACGG 


TAGATTTACC 


AGTTATTGGC 


11100 




ATTGTGAAAC 


GTGACTATGA 


TCACTCAGAT 


GTTTTCATTA 


CTGCAACGTC 


AAAAGAAGTT 


11160 


50 


GATGAACTGA 


TAGAAAGCCA 


ATGTGAAGTC 


ATTGCATTGG 


ATGCAACGTT 


ACAGCAACGT 


11220 




CCGAAAGAAA 


CGTTAGACGA 


ATTAGTATCA 


TATATTAGAA 


CACATGCACC 


GAACGTTGAA 


11280 
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TATATTGGCA CGACGTTACA TGGCTATACT AGTTATACGC AAGGACAATT ACTTTATCAA 11400 

AATGACTTCC AATTTTTAAA AGATGTACTA CAAAGTGTTG ATGCAAAAGT TATTGCGGAA 11460 

GGTAATGTCA TTACACCGGA TATGTATAAA CGTGTGATGG ACTTAGGCGT TCATTGTTCA 11520 

GTCGTTGGTG GTGCGATAAC ACGACCAAAA GAAATTACGA AACGTTTTGT TCAAATTATG 11580

GAAGATTAAA TGATAACGAT AAAAAAACGA GATGACCATC ATTAATTAAA GGCACCTAAT 11640 

TATCTTAGGT GGCTGAATGA ATGTAATGGG TTCATCTCGT TTTGTTTGTT TATGATAGTG 11700 

ATTTTATTTT CAACTTTATC CAAAAATAAG TAAAGCGACG GGGATGGTGA TTAATAGCGA 11760 

CAACGCCACG CGTAAAAACC AAATGATGAT GAGTTTCCAG ACAGGTATTT TAATTTCAGT 11820 

TGCTAGTATA CATGGCACTA ATGCTGAGAA AAAGATAATG GCTGATACGC TTACTACACC 11880 

GACGACAAAT TTAGTACTCA TTGCAGCTTT AGTTACTAAC AAAGATGGTA GAAACATCTC 11940 

TACAATAGAA AckCTGACGC TTTTGCTAGT AAAGCCTGAT CAGCAATTGG GAAAATATAA 12000

ATAAATGGAT AGAAGATATA GCCAAGCCAA TCAATGAATG GTGTATAGTT CGCTACAATC 12060 

AGTCCTAAAA AACCAATCGA TAATATAGAA GGTAAAATAC CAACAGTCAT TTCTAAACCG 12120 

TCTTTCAAAT TGTCCCAAAC GTTCTTCACG AGAGATGGTG TTAATGCATT TTGTTTCATC 12180

GCCTCTGCAT ATGCAGTTTT CAGTCTGCTT CCTTCAATAG CAACTTCTTG TTCTCCTTCT 12240 

TGTCCGTTAT AATATTCTGT TGATTCATTG CTGATTGGCG GTAGCCATGC AGTAATTGCA 12300 

GTCACGACAA ATGTGATGAC TAAAGTTATC CAAAAGTATA AATTCCAATG CGGCATTAAT 12360

CCTAAAGTTT TAGCAACGAT AATCATAAAA GTTGCTGAAA CTGTTGAAAA GCCAGTCGCA 12420

ATAATCGTGG CTTCTCGTTT GTTGTACATC CCTTGCTTAT AGACACGATT AGTAATCAAT 12480

AATCCTAAGG AATAACTGCC GACAAACGAA GCCACTGCAT CGACAGCGGA TTTTCCTGGT 12540

GTTETAAAAA TAGGTCTCAT AATAGGCTCC ATATAAACAC CGACAAATTC TAATAAGCCA 12600 

TAGCCCACTA ATAAAGAAAG cGcAATTGCA CCTACTGGAA TTAAGATACT TAATGGCATC 12660 

ATTAATTTTT CAAACAAAAA CGGACCATAG TTAGCTTTAA ATAGTATTGA TGGACCGATT 12720 

TTAAATACAT ACATTATACC GATCATTGCA CCTGCAACTT TAAATAATGT AATGACCAAG 12780 

TTTGTGATTG AAGTCATAAA AGTACGTCTC ACTATTGGTA ACGCTGTACC AATTAAAATC 12840

ATAATCAGTG CAACATAGGG CATAAGTGGA CCTATGATTG AGCGAATGGC TAGATGAACA 12900 

TGATCGACGA AAATAGTGTT GTTACCATTA ATCGTAAAAG GAATAAAGAA ACATAGTATG 12960 

CCCACTAAAC TATAGACAAA AAAACGCCAT GCACTTGGTT GTTGTGCATT AGAATGATAT 13020 

TGATTCATTA AAGCAACCCC TTTGTTTAAA TGAATACACA AAACTGTATG ATGCATCTTC 13080 
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ATAGTTTGAA 


TTATTTTCAT 


ACCAATACAA 


ATTAACTAAT 


TATATATAGA 


TTGAAACTAT 


13200 




ATTACTTAAT 


AAAAT ATTTA 


TCTTAAATGT 


TGTTGTGTTG 


ATTCAACACC 


ACAACTAAAA 


13260 


5 


GTGTTTATAA 


ATTATTTGGA 


AATACACATA 


TTTGTAAATG 


ATTAGTATCG 


ATTTAATATC 


13320 




GTATTATTAA 


ATTTTTATTA 


ATTTTGTAGT 


CTTAATCmAA 


AAATAATATA 


TGTCATGTTA 


13380 


10 


TATTGAAGGT 


GCAGTTGTTT 


TTCATTCTCA 


AGAGGGGGTC 


AAAAAAATAC 


TTTTGAGGTG 


13440 


ATTATATGTT 


AAGAGGACAA 


GAAGAAAGAA 


AGTATAGTAT 


TAGAAAGTAT 


TCAATAGGCG 


13500 




TGGTGTCAGT 


GTTAGCGGCT 


ACAATGTTTG 


TTGTGTCATC 


ACATGAAGCA 


CAAGCCTCGG 


13560 


IS 


AAAAAACATC 


AACTAATGCA 


GCGGCACAAA 


AAGAAACACT 


AAATCAACCG 


GGAGAACAAG 


13620 




GGAATGCGAT 


AACGTCACAT 


CAAATGCAGT 


CAGGAAAGCA 


ATTAGACGAT 


ATGCATAAAG 


13680 




AGAATGGTAA 


AAGTGGAACA 


GTGACAGAAG 


GTAAAGATAC 


GCTTCAATCA 


TCGAAGCATC 


13740 


20 


AATCAACACA 


AAATAGTAAA 


ACAATCAGAA 


CGCAAAATGA 


TAATCAAGTA 


AAGCAAGATT 


13800 




CTGAACGACA 


AGGTTCTAAA 


CAGTCAGACC 


AAAATAATGC 


GACTAATAAT 


ACTGAACGTC 


13860 




AAAATGATCA 


GGTTCAAAAT 


ACCCATCATG 


CTGAACGTAA 


TGGATCACAA 


TCGACAACGT 


13920 


25 


CACAATCGAA 


TGATGTTGAT 


AAATCACAAC 


CATCCATTCC 


GGCACAAAAG 


GTAATACCCA 


13980 




ATCATGATAA 


AG CAG CAC CA 


ACTTCAACTA 


CACCCCCGTC 


TAATGATAAA 


ACTGCACCTA 


14040 




AATCAACAAA 


AGCACAAGAT 


GCAACCACGG 


ACAAACATCC 


AAATCAACAA 


GATACACATC 


14100 


30 


AACCTGCGCA 


TCAAATCATA 


GATGCAAAGC 


AAGATGATAC 


TGTTCGCCAA 


AGTGAACAGA 


14160 




AACCACAAGT 


TGGCGATTTA 


AGTAAACATA 


TCGATGGTCA 


AAATTCCCCA 


GAGAAACCGA 


14220 




CAGATAAAAA 


TACTGATaAT 


AAACAACTAA 


TCAAAGATGC 


GCTTCAAGCG 


CCTAAAACAC 


14280 


35 


GTTCGACTAC 
TACAACCACT 


AAATGCAGCA 
TAACAAATAT 


GCAGATGCTA 


AAAAGGTTCG 


ACCACTTAAA 


GCGAATCAAG 


14340 




CCAGTTGTTr 


TTGTACATGG 


ATTTTTAGGA 


TTAGTAGGCG 


14400 


40 


ATAATGCACC 


TGCTTTATAT 


CCAAATTATT 


GGGGTGGAAA 


TAAATTTAAA 


GTTATCGAAG 


14460 


AATTGAGAAA 


GCAAGGCTAT 


AATGTACATC 


AAGCAAGTGT 


AAGTG CATTT 


GGTAGTAACT 


14520 




ATGATCGCGC 


TGTAGAACTT 


TATTATTACA 


TTAAAGGTGG 


TCGCGTAGAT 


TATGGCGCAG 


14580 


45 


CACATGCAGC 


TAAATACGGA 


CATGAGCGCT 


ATGGTAAGAC 


TTATAAAGGA 


ATCATGCCTA 


14640 




ATTGGGAACC 


TGGTAAAAAG 


GTACATCTTG 


TAGGGCATAG 


TATGGGTGGT 


CAAACAATTC 


14700 




GTTTAATGGA 


AGAGTTTTTA 


AGAAATGGTA 


ACAAAGAAGA 


AATTG CCTAT 


CATAAAGCGC 


14760 


SO 


ATGGTGGAGA 


AATATCACCA 


TTATTCACTG 


GTGGTCATAA 


CAATATGGTT 


GCATCAATCA 


14820 




CAACATTAGC 


AACACCACAT 


AATGGTTCAC 


AAGCAGCTGA 


TAAGTTTGGA 


AATACAGAAG 


14880
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ATTTAGGATT AACGCAATGG GGCTTTAAAC AATTACCAAA TGAGAGTTAC ATTGACTATA X5000 

TAAAACGCGT TAGTAAAAGC AAAATTTGGA CATCAGACGA CAATGCTGCC TATGATTTAA 15060 

CGTTAGATGG CTCTGCAAAA TTGAACAACA TGACAAGTAT GAATCCTAAT ATTACGTATA 15120 

CGACTTATAC AGGTGTATCA TCTCATACTG GTCCATTAGG TTATGAAAAT CCTGATTTAG 15180 

GTACATTTTT CTTAATGGCT ACAACGAGTA GAATTATTGG TCATGATGCA AGAGAAGAAT 15240 

GGCGTAAAAA TGATGGTGTC GTACCAGTGA TTTCGTCATT ACATCCGTCC AATCAACCAT 15300 

TTGTTAATGT TACGAATGAT GAACCTGCCA CACGCAGAGG TATCTGGCAA GTTAAACCAA 15360 

TCATACAAGG ATGGGATCAT GTCGATTTTA TCGGTGTGGA CTTCCTGGAT TTCAAACGTA 15420 

AAGGTGCAGA ACTTGCCAAC TTCTATACAG GTATTATAAA TGACTTGTTG CGTGTTGAAG 15480 

CGACTGAAAG TAAAGGAACA CAATTGAAAG CAAGTTAAAT TCATCTTCTG AATTTAATAT 15540 

GCTATGTAAA TCGTGCTGTT ATCATGGCAC ATCAGATATA AGTAGCATCA CAGTGTTGAA 15600

TTTAAAAATA GTAAAGTGAA ATAAAGCGCC TGTCTCATTA GCGAAAACTA AAGGGACAGG 15660 

CGTATCTGTT TATGAGCTTA ATAAATTGTA TGAATAATAT GGTTGATCGA ATAACTGTTT 15720 

ATCATGATGA TAAATTGAGT TTTTTAAAAT AATGATATAT TACATCATTC TTATAGCGTT 15780

TAAGAAATCA ACAACTTTAC GATAAATAGT GATTGCTTCG TCATTAGGTC TACGATCAAA 15840 

ATCATGCTCG TTTTTATTCA CGCGTTCAAA TGTTGAATGT GGAACATGAT TCATGATATG 15900 

TTCGCTTTCC TCAACGGGAA CATCATAATC GCCATTACAA TGCGCAATGA AAACAGGTGG 15960

AAGTGTTTTA AGTTCATCTG GTGCAATATT ATATTTTGAA TTAGTATAAT CAGCAATGTT 16020

AATCATATTT ATCCATTTAC CTGTGCCACG TGCATAAACG TAGATTAAAA AACGTTGTGC 16080

GATTTGATCT TGAACAACCG GTGTTGGTGA AGTGAGTTGT GCAATCATTG TTTCGTTTAC 16140

GCTTTGAGCT ATTTTTGCGT AATAACTATT AGTTGTTTTA AAAGGTTCAG TGTTGATGCG 16200 

ACTATAACCA TAAAAATCAA TAACACCATC AATATCTCTG TCTCGTGCAA TTAATAGACT 16260 

TAAATATGCA CCTGATGATC TGCCAAAGGT AAAAATAGGG CAATTAGAAT ATTGTGATTG 16320 

AATCGCATCG AATGAtGCgn AGnACATCCT CAATAATGCA ATCGAGACTT ACTTCTGGTA 16380 

ATAAACGATA ACTTAGTTGA ATTAAATCGT AATGTTCCGT AAgATATCGA TATACTGTGG 16440 

GGATAAATCG TTAGCTTTAC CGAACATTAA TCCACCACCG TGGATGTAGA CAATAGCGCC 16500 

TTTTGTTGGT TGATTTTTTG CTTTAATAAT TGTGTAAGGT AATGCAAATG CATCTTTAGT 16560 

AATTACTTTA TCTTTAATTT CAGTCACGAT TTAATAGGCT CCTTATTTTT GATATTGATG 16620

TCATTATAAC ACTGTCTTAA ATTTCCATGA AAAATAGTCT TAAGACGATG AGTCATGATA 16680 
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CATCATTTTA 
ACAATATCTT TAAAAGCAGC ATGTGGAATG GCTAAATCTT CTAAATCTGC 16800
CATAGAAAAT TCAAGATTGA TATCATGTGG TCGCTGTTCA GCAAGTTTAT GCACAAAGTC 16860
AGGTTCTGTG ACAAAAGGCG AAGACATGCC GACCATATCT GCATGTTGTA AAGCATCTAA 16920
AGCAGACTCT GGAGAATTAA TCCCGCCACT TGCAATTAAA GGGATACGAC CTGCTAAATG 16980
TTCATAGACA ATTTGGTTAA CTGGTCGACC GAAATGATCA CCTGGTGTAC GAGACGTATT 17040
TTGATAAATA TGTCGACCCC AGCTAGCGAT TGGTAAGTAT TGGATGTTTG AAACGTCCAT 17100
GACCCAATTG ATTAATTGGT TGAACTCGTC AATGGTATAT CCTAAATCAC TGCCTCTGGT 17160
TTCTTCTGGC GTTGCTCGAA ATCCTAAAAT AAAATTGTCA GGTGCTTCTT TATCAATCAC 17220
TTCTTGTACC GCACGCATAA CTTCTAAACA TAATCTTGCA CGATTTTTTA ATGAGTCGGC 17280
ACCGTAATGG TCTGTACGTT TATTCGAAAA AGTTGAGAAA AATGTTTGAA TCAGCAAACG 17340
TTGTGCAATC GAAATTTCCA CACCATCAAA ACCTGCTTTA ATCGCGCGTA ATGTAGCATC 17400
GCGATACTGC TGAATGATGC TATTGATTTT CTCATGAGAC ATGGCGATAA CATCGTGTTC 17460
AATCGGTGAA TGCAATGTCA TAGGGCTTGG TCCATACACC TTTCCAAAAT TTAAAATGGC 17520
TTGATTTGAA AAACGACCAG CATGCGCTAg CTGGATAATA GCGAGGCTAC CATGTTGTTT 17580
CATCGTAGAT GCCATGTTAG TTAATCCAGG GATACAAGCA TCATGATCAA TATTAAAGCC 17640
ATATTCAAAC AATTGACCAT AAGGTTCAAT GTAAGCAGCG CCGGTGACTT GCATTCCAGC 17700
TGAATTAGAG CGACGTGCAG CATAAGCCAA GTCTTCTTTT GTAATATAGC CTTCTTTTGT 17760
TGATGTGTTT ACGGTCATTG GTGATAATAC AAAGCGATTC GAAATTTTGA TGCCATTAGG 17820
TAAGTGGATT GATTGTAAAA GTGGTTTGTA TCGGTACATA CTATGATTCC TTTTCTATTC 17880
AATATTGTTT TCAAAGTACC ATGGAAAGAA TGAATAATCA ATGATGAACA GTCTTGATAG 17940
AATAGAATTG GTACATGGAA AGTATTTTTA AAATTAAACT AATGAATGGC ATTTGTAGGT 18000
CTGAAAATAT GAATATGAAA AAGAAAAATA AAGGCGAAAA GATATAAAAG TTAATTGAAA 18060
AACGTTATCA TATACGTGGG TATATGAAGA GGGAATGGTA TTAAGAACGC TAAAATGTTA 18120
TGTCGGTTTG ACATGACAGG ATAAGTTTGG AGATGACGGA TTGGTTAAAT TAAGCGTATT 18180
AGACTATGCC TTAATAGATG AAGGTAAGGA TGCACAAAAG GCATTGCAAG ATTCAGTGAC 18240
ACTTGCAAAA TTAGCAGATC GACTTGGCTT TAAGCGAATT TGGTTTACGG AACATCATAA 18300
TGTACCAGCG TTTGCGTGTA GTAGTCCAGA ACTTTTGATG ATGCATACAT TGGCGCAGAC 18360
AAATCACATA CGAGTTGGCT CTGGTGGTGT GATGCTGCCG CACTATCGAC CTTATAAAAT 18420
TGCTGAGCAT TTTAGAATGA TGGCAGCGTT ATATCCAAAT CGTATTGATT TAGGTATTGG 18480
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TAGTTACGAT 
GAATCGATTT CGTTATTACG TGATTATCTT ACAATAAAGG ATAAACCAAG 18600
TGCGCATACG TTAGGTGTCC AACCACACAT TGATCATTTT CCAGAAATGT GGTTATTAAG 18660
TAGTAGCGCA ACATCTGCCA AAATAGCTGC CGAACTAGGT ATAGGGCTTT CTGTTGGAAC 18720
ATTTTTGCTA CCAGATATAA ATGCGATACA TACAGCGAAG GATAACATTG ATATTTACAA 18780
AAAACATTTC CAAGCATCAA CGATTAAAAT GGACGCAAAG GTGATGGCAT CTGTATTTGT 18840
CATTGTAGCT GATAACGAAG CGGAAGTAGC AGCATTACAA CATGCCTTAG ATGTTTGGTT 18900
ATTAGGTAAA TTACAATTTG CAGAATTTGA AGATTTTCCT TCAGTAGACA CAGCACAAAA 18960
GTATAAGCTT AATGATCGAG ACAAAGAGAT GATTCAAGCA CATCAAGCAC GCATCATTGC 19020
AGGTACACAA GAAAAGGTTA AAGCACAATT AGATGATTTC ATTGCTACGT TTGAAGTTGA 19080
TGAGGTGTTA GTAGCACCGC TTATTCCAGG TATTGAACAG CGTTGTAAAA CATTAAAATT 19140
ACTCGCGGAA ATTTATTTGT AGCATTTTAA ATAGAAGAGA AAGGATGAAG ATAAGATGAA 19200
AAAGTTAGCC AATTATTTAT GGGTAGAAAA AGTAGGAGAT TTGTATGTGT TTAGTATGAC 19260
ACCTGAATTG GAAGATGATA TTGGGACAGT AGGTTATGTT GAATTCGTAA GTCCAGATGA 19320
AGTTAAAGTG GATGATGAAA TTGTGAGTAT CGAAGCATCG AAAACGGTCA TTGATGTGCA 19380
AACGCCATTG TCAGGAACGA TTATTGAGCG AAATACAAAA GCGGAAGAAG AACCGACAAT 19440
TTTAAACTCT GAAAAACCAG AAGAAAATTG GTTGTTCAAA TTGGATGATG TCGATAAAGA 19500
AGCATTCCTA GCATTACCGG AGGCTTAAAT GGAAACGTTA AAATCAAATA AAGCGAGACT 19560
TGAATATTTA ATCAATGATA TGCATCGAGA GAGAAATGAC AATGACGTAT TGGTAATGCC 19620
ATCTTCATTT GAAGATTTGT GGGAATTATA TCGAGGCTTA GCAAATGTCA GACCGGCATT 19680
ACCTGTAAGT GATGAATATT TAGCTGTACA AGATGCTATG TTAAGTGATT TGAATCGTCA 19740
ACAU5TTACG GATTTGAAGG ATTTGAAGCC GATAAAAGGT GACAATATCT TTGTTTGGCA 19800
AGGTGATATC ACGACGTTAA AAATCGATGC TATTGTTAAT GCTGCAAATA GTCGTTTTCT 19860
AGGATGTATG CAAGCTAATC ATGACTGCAT TGATAATATT ATTCATACAA AAGCGGGTGT 19920
TCAAGTTCGA CTTGATTGTG CAGAGATCAT TCGACAACAA GGGCGCAATG AAGGTGTAGG 19980
TAAAGCCAAA ATAACACGTG GATATAATTT GCCAGCAAAG TATATAATTC ATACGGTTGG 20040
TCCGCAAATA CGTCGATTGC CTGTTTCAAA GATGAATCAG GACTTGTTAG CTAAATGTTA 20100
TCTTAGCTGT CTTAAATTGG CTGATCAACA TAGTTTAAAT CATGTCGCTT TTTGCTGTAT 20160
ATCTACAGGT GTATTTGCTT TTCCTCAAGA TGAAGCAGCA GAAATTGCTG TTCGAACAGT 20220
AGAAAGCTAT CTCAAAGAAA CAAATTCAAC ATTGAAAGTC GTGTTCAATG TATTTACAGA 20280
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CAATGTCTCT GTTAATGGAT GACAAGACAA AGCAGGCTGA AGTATTGCGT ACTGCGATTG 20400 

ATGAAGCAGA TGCGATAGTG ATTGGAATTG GTGCAGGCAT GTCTGCATCT GACGGATTTA 20460 

CATATGTAGG AGAGCGTTTT ACGGAAAATT TCCCAGATTT TATTGAAAAA TATCGCTTCT 20520 

TTGATATGTT GCAAGCGAGT TTACATCCTT ATGGCAGTTG GCAAGAGTAT TGGGCATTTG 20580

AGAGTCGTTT TATTACATTA AACTATTTAG ATCAACCTGT AGGTCAGTCT TACCTCGCTT 20640

TAAAATCCTT GGTGGAAGGT AAACAGTACC ACATTATAAC TACGAATGCA GATAATGCTT 20700 

TCGATGTAGC TGATTATGAT ATGACTCATG TATTTCATAT ACAAGGGGAG TATATACTGC 20760 

AACAGTGTAG cTCAGCATTG TCATCCTCAA ACGTATCGCA ATGATGATTT AATTCGTAAA 20820 

ATGGTTGTTG CGCAACAAGA TATGCTTATA CCTTGGGAGA TGATTCCAAG ATGTCCAAAA 20880 

TGTGATGCCC CAATGGAAGT GAATAAACGT AAAGCGGAAG TTGGGATGGT TGAAGATGCT 20940 

GAATTTCATG CGCAACTACA TCGTTATAAT GCTTTTCTAG AGCAACATCA AGATGATAAA 21000

GTGTTGTATT TGGAAATTGG AATTGGTTAT ACTACAGGAC AATTTGTGAA GCATCCTTTT 21060 

CAGCGTATGA CACGTAAAAA TGAAAATGCC CTTTATATGA CGATGAATAA AAAGGCATAT 21120 

CGCATTCCGA ATTCAATTCA AGAACGTACC ATACATTTAA CTGAGGATAT CTCAACATTG 21180

ATTACAGCAG CACTCCGGAA CGACAGCACA ACGAAAAATA ACAACATTGG AGAGACAGAA 21240

GATGTACTTA ATAGAACCGA TTAGAAATGG AGAATATATT ACTGATGGTG CGATTGCACT 21300

CGCTATGCAA GTTTATGTTA ACCAGCATAT CTTTTTAGAT GAAGATATTT TATTCCCTTA 21360

TTATTGTGAT CCAAAAGTGG AAATTGGACG TTTTCAAAAT ACTGCTATAG AAGTGAATCA 21420

AGATTATATA GATAAACACA GTATTCAAGT AGTTCGCCGA GATACTGGTG GTGGCGCTGT 21480

GTATGTTGAT AAAGGTGCCG TTAATATGTG TTGTATTTTA GAACAAGACA CTTCAATTTA 21540 

TGGTSATTTT CAACGATTTT ATCAACCAGC TATAAAGGCG TTGCATACAT TAGGTGCAAC 21600 

AGATGTGGTA CAAAGCGGTA GAAATGATTT AACATTGAAT GGTAAAAAAG TGTCAGGCGC 21660 

CGCAATGACA TTAATGAATA ATCGTATTTA TGGCGGTTAT TCGCTATTAC TTGATGTTAA 21720 

TTATGAAGCA ATGGATAAAG TGTTAAAGCC TAATCGCAAA AAGATTGCAT CGAAAGGGAT 21780

TAAATCTGTG CGCGCACGTG TTGGTCATCT TAGAGAAGCA CTGGATGAAA AGTATCGTGA 21840 

TATAACCATT GAAGAATTTA AAAATTTAAT GGTGACGCAG ATTTTGGGAA TCGATGACAT 21900 

TAAAGAGGCG AAACGATATG AATTAACGGA TGCAGATTGG GAAGCGATTG ATGAATTAGC 21960 

TGATAAAAAG TATAAAAATT GGGATTGGAA TTATGGCAAG TCACCCAAAT ATGAATACAA 22020

TCGAAGTGAA AGATTATCTT CAGGTACGGT AGACATAACA ATTTCTGTTG AACAAAATCG 22080 
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AGAAGCATTA 


CAAGGAACAA 


AAATGACAAG 


AGAAGATTTA 


ACGCATCAGT 


TAAAGCAATT 


22200 




AGACATCGTT 


TATTATTTTG 


GCAATGTTAC 


GGTAGAAGCA 


TTAGTGGATA 


TGATTTTAAG 


22260 


5 


TTAATATTGT 


TATTTTATGT 


ATGCTGAATC 


ATTGGAAGTG 


TTTGCTTGCT 


CTTGAAAAGG 


22320 




TGACAATAGT 


GTTTGGTGAA 


GGTTGAACAT 


ATGAGTGGAA 


ATTATTGCCT 


TTAACTATTC 


22380 




AAAGTATGAT 


ATATATATGG 


T1TTTGTTTC 


TAAATGATTG 


GGTATTTGAA 


AATAGATGAG 


22440 


10 


TTTAAT ATTT 


TAAGGAATAT 


AATGATGTTT 


ACTTTTATAA 


TTCATATAGA 


ATATTAAGCA 


22500 




ATATAAGTCT 


GTTGATATAT 


ACAAAATATA 


ATGACTGCTA 


TAATGAGTAA 


TCAATAGACA 


22560 


1S 


CAAAGAGGAG 


ATTATGTGAT 


GAATAATAAA 


GTATTAGTAA 


CCGGTGGTAC 


AGGGTTTGTT 


22620 


GGCATGCGAA 


TTATTTCACG 


ATTATTAGAA 


CAAGGTTATG 


ACGTACAAAC 


GACGATACGT 


22680 




GATTTAAGTA 


AAGCTGATAA 


AGTAATTAAA 


ACAATGCAAG 


ACAATGGCAT 


TTCCACAGAG 


22740 


20 


CGATTAATGT 






CAAGATGAAC 


ATTGGGATGA 


AGCAATGAAA 


22800 




GATTGCAAGT 


ATGTCTTGAG 


TGTAGCATCT 


CCGGTGTTTT 


TCGGTAAAAC 


AGACGATGCA 


22860 




GAAGTGATGG 


GGAaCTGcAA 


TTGAAGGTAT 


ACAACGTATT 


TTAAGAGCTG 


CAGAACATGC 


22920 


25 


GGGTGTTAAA 


CGTGTGGTAA 


TGACTGCAAA 


CTTTGGTGCA 


GTTGGTTTTA 


GTAATAAAGA 


22980 




TAAAAATTCA 


ATCACAAATG 


AAAGTCATTG 


GACAAATGAA 


GATGAACCAG 


GCTTATCAGT 


23040 




ATATGAAAAA 


TCAAAATTGT 


TAGCTGAAAA 


GGCAGCGTGG 


GATTTTGTTG 


AGAATGAAAA 


23100 


30 


TACAACAGTA 


GAATTTGCCA 


CAATCAATCC 


AGTTGCAATT 


TTTGGGCCAT 


CATTAGATGC 


23160 




AGACGTTTCA 


LrOAALyV- 1 1 1L 


nl 1 Inl inun 


A A A'TTTATTG 

nnrt X X X r\ X ivj 


AATGGTTCAA 


TGAAACGTGT 


23220 




ACCGCAAATT 


CCGTTAAATG 


TTGTTGATGT 


GAGAGACGTA 


GCTGAACTGC 


ACATTTTGGC 


23280


35 


AATGACAAAT 


GAACAAGCTA 


ATGGCAAGCG 


ATTTATTGCG 


ACGGCTGATG 


GACraAATTwA 


23340 




tTTGTTGGGA 


ATTGcCAAAt 


TAATTAAAGA 


AAAGGGCCTG 


GAAATAGCTC 


CAAAAGTTCC 


23400 




TACTAAAAAA 


TTACCCAGCT 


TTATTTTGAG 


CnAnGnGCC 






23439 



(2) INFORMATION FOR SEQ ID NO: 39:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4522 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
CCCTTTGAGA GTATATCATC TAGTCAAATT ATGCCTGTCA TTAGAGCGAC TAGCTTTGAT 
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TATTATGCAG 


! TCGATTTAGG 


> GAAATCATAT 


' CGTCTAATTG 


TV /"V - * 7\ » T± f^/~*1\ T V 

A LajAAACjCAT 


GTTAGAGGAT 


180 




TTGAAGTTAA 


• CTGAACAACA AATAAGAGAA ATGTCTCTGT 


X X nn lull M w 


AAAA1 T(jT CA 


240 


5 


AATTCATATA 


. CGACTGATGA 


AGTAAAAGGT 


AATATTTTTT 


S\X 1 1 X*\X Xf\i\ 




300 




GGGTATGATG 


CAAGTAGGAT 


ACTAAATACT 


GCATTTTTAA 


21 TYS 2121 21 Tm & 
« X Vjnnn X i\ja 


t^WVCAATGT 


360 


70 


CAAGGCGAAA 


TGCTCGTAGC 


AGTGCCACAC 


CAAGATGTGT 


1#\/\1 Inl lvjv». 


ALiATATACGC 


420 


AATAAAACAG 


GATATGATGT 


GATGGCACAT 


TTAACAATGG 


nn 1 1 1. x 1 LAL 


TAAAGGTCTA 


480 




GTTCCAATTA 


CATCATTATC 


CTTTGGATAT 


AAACAGGGTC 


AH_J. IVJAACV. 


GATATTTATT 


540 


75 


TTAGGTAAAA 


ATAATAAACA 


AAAAAGAGAT 


CCAAACGTGA 


''P^P^^ i ^2^^^^ HP* CI* 
1 1 rWjV^Vj HI 


AuAAvjCAAAT 


600 




CGTCGTAAAT 


TTAATAAAGA 


TAAATAGAAA 


TAATTGGATA 




TGATAATGAA 


660 




TTTATTTTAC 


AATCCTAAAT 


ATGTAGGAGA 


TGTCGCATTT 




AACCAGTTGA 


720 


20 


AGGTGAATTA 


AACTACAATA 


AAAAAGGTAA 


TGTTGTTGAA 




AAGGTAATGT 


780 




TGTAGGTTAT 


AATATTTTTG 


AAATTTCAAA AGATATAACA 


Al iAjAAoAAA 


AAGGTCATAT 


840 




TAAATTAACT 


GATGAACTTG 


TAAATGTATT 


CCAAAAGCGT 


Al 1 I LAbAAG 


CTGGTTTTGA 


900 


25 


TTATAAATTA 


AATGCTGATC 


TATCACCGAA 


*» XXX \J X tt\3 L X 


uuLlAwl 1\» 


AAACTAAAGA 


960 




CAAACATCCT 


GATGCAGATA 


AATTAAGTGT 


ACTAAATGTA 


AAto 1 I OviAA 


ATGACACATT 


1020 




ACAAATTGTA 


TGTGGCGCGC 


CTAACGTTGA 


AGCTGGACAG 




TTGCTAAAGT 


1080 


30 


AGGTGCAGTG 


ATGCCTAGCG 


GTATGGTAAT 


TAAAGATGCT 


VaAAl lALvj 1 Vj 


GTGl 1GCCTC 


1140 




AAGCGGTATG 


ATTTGTTCAA 


TGAAAGAATT 


GAATTTACCT 




A & TV TV TV TV TV 

aaGAAAAAGG 


1200 




TATTATGGTA 


TTAAATGACA 


GCTATGAAAT 


TGGACAAGCA 


TTtTTTGAAT 


AATTAAGGAA 


1260 


35 


GGTAGTGAAA ATATGAGCTG 


GTTTGATAAA 


TTATTCGGCG 


AAGATAATGA 


TTCAAATGAT 


1320 




GACTTGATTC 


ATAGAAAGAA 


AAAAAGACGT 


CAAGAATCAC 


AAAATATAGA 


TrACGATCAT 


1380 


40 


GACTCATTAC 


TGCCTCAAAA 


TAATGATATT 


TATAGTCGTC 


CGAGGGGAAA ATTCCGTTTT 


1440


CCTATGAGCG 


TAGCTTATGA 


AAATGAAAAT 


GTTGAACAAT 


CTGCAGATAC 


TATTTCAGAT 


1500 




GAAAAAGAAC 


AATACCATCG 


AGACTATCGC 


AAACAAAGCC 


ACGATTCTCG 


TTCACAAAAA 


1560 


45 


CGACATCGCC 


GTAGAAGAAA 


TCAAACAACT 


GAAGAACAAA 


ATTATAGTGA 


ACAACGTGGG 


1620 




AATTCTAAAA 


TATCACAGCA 


AAGTATAAAA 


TATAAAGATC 


ATTCACATTA 


CCATACGAAT 


1680 




AAGCCAGGTA 


CATATGTTTC 


TGCAATTAAT 


GGTATTGAGA 


AGGAAACGCA 


CAAGCCAAAA 


1740 


SO 


ACACATAATA 


TGTATTCTAA 


TAATACAAAT 


CATCGTGCTA 


AAGATTCAAC 


TCCAGATTAT 


1800 




CACAAAGAAA 


GTTTCAAGAC 


TTCAGAGGTA 


CCGTCAGCTA 


TTTTTGGCAC 


AATGAAACCT 


1860 
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AAACAAAAAT 


ATGATAAATA 


TGTAGCTAAG 


ACGCAAACGT 


CTCAAAATAA 


ACAATTAGAA 


1980 




CAAGAAAAAC 


AAAATGATAG 


TGTTGTCAAA 


CAAGGAACTG 


CATCTAAATC 


ATCTGATGAA 


2040 


5 


AATGT AT CAT 


CAACAACAAA 


ATCAATGCCT 


AATTATTCAA 


AAGTTGATAA 


TACTATCAAA 


2100 




ATTGAAAATA 


TTTATGCTTC 


ACAAATTGTT 


GAAGAAATTA 


GACGTGAACG 


AGAACGTAAA 


2160 




GTGCTTCAAA 


AGCGTCGATT 


TAAAAAAGCG 


TTGCAACAAA 


AGCGTGAAGA 


ACATAAAAAC 


2220 


10 


GAAG AG CAAG 


ATGCAATACA 


ACGTGCAATT 


GATGAAATGT ATGCTAAACA AG cGGAACgC 


2280 




TATGTTGGTG 


ATAGTTCATT 


AAATGATGAT 


AGTGACTTAA 


CAGATAATAG 


TACAGATGCT 


2340 




AGTCAGCTTC 


ATACAAATGG 


CATAGAGAAT 


GAAACTGTAT 


CAAATGATGA 


AAATAAACAA 


2400 


15 


GCGTCAATAC 


AAAATGAAGA 


CACTAATGAC 


ACTCATGTAG 


ATGAAAGTCC ATACAATTAT 


2460 




GAGGAAGTTA 


GTTTGAaTCA 


AGTATCGACA 


ACAAAACAAT 


TGTCAGATGA 


TGAAGTTACG 


2520 


20 


GTTTCGAATG 


TAACGTCTCA 


ACATCAATCA 


GCACTACAAC 


ATAACGTTGA 


AGTAAATGAT 


2580 


AAAGATGAAC 


TAAAAAATCA 


ATCCAGATTA 


ATTGCTGATT 


CAGAAGAAGA 


TGGAGCAACG 


2640 




aATAAAGAAG 


AATATTCAGk 


AAGTCAAATC 


GATGATGCAG 


AATTTTATGA 


ATTAAATGAT 


2700 


25 


ACAGAAGTAG 


ATGAGGATAC 


TACTTCAAAT 


ATCGAAGATA 


ATACCAATAG 


AAACGCGTCT 


2760 


GAAATG CATG 


TAGACGCTCC 


TAAAACGCAA 


GAGTACGCAG 


TAACTGAATC 


T CAAGTAAAT 


2820 




AATATCGATA 


AAACGGTTGA 


TAATGAAATT 


G AATT AG CAC 


CGCGTCATAA 


AAAAGATGAC 


2880 


30 


CAAACAAACT 


TAAGTGTCAA 


CTCATTGAAA 


ACGAATGATG 


TGAATGATAA 


TCATGTTGTG 


2940 




GAAG ATT CAA 


GCATGAATGA 


AATAGAAAAG 


AATAACGCAG 


AAATTACAGA 


AAATGTGCAA 


3000 




AACGAAGCAG 


CTGAAAGTGA 


ACAAAATGTC 


GAAGAGAAAA 


CTATTGAAAA 


CGTAAATCCA 


3060 


35 


AAGAAACAGA 


CTGAAAAGGT 


TTCAACTTTA 


AGTAAAAGAC 


CATTTAATGT 


TGTCATGACG 


3120 




CCATCTGATA AAAAGCGTAT 


GATGGATCGT 


AAAAAGCATT 


CAAAAGTCAA 


TGTGCCTGAA 


3180 




TTAAAGCCTG 


TACAAAGTAA 


GCAAGCTGTG 


AGTGAAAGAA 


TGCCTGGGAG 


TCAAGCCACA 


3240 


40 


CCATCATCAA 


GATCTGATTC 


ACAAGAGTCA 


AATACAAATG 


CATATAAAAC 


AAATAATATG 


3300 




ACATCAAACA 


ATGTTGaGAA 


CAATCAACTT 


ATTGGTCATG 


CAGAAACAGA 


AAATGATTAT 


3360 




CAAAATGCAC 


AACAATATTC 


AGAGCAGAAA 


C CTT CTGTTG 


aTTCAACTCA 


AACGGAAAT A 


3420 


45 


TTTGAAGAAA 


GTCAAGATGA 


TAATCAATTG 


GAAAATGAGC 


AAGTTGATCA 


ATCAACTTCG 


3480 




TCTTCAGTTT 


CAGAAGTAAG 


CGACATAACT 


GAAGAAAGCG 


AAGAAACAAC 


ACATCCAAAC 


3540 


50 


AATACTAGTG 


GACAACAAGA 


TAATGATGAT 


CAACAAAAAG 


ATTTACAGTC 


ATCATTTTCA 


3600 


AATAAAAATG 


AAGATACAGC 


TAATGAAAAT 


AGACCTCGGA 


> CGAACCAACA 


. AGATGTTGCA 


3660 
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CCAAGTGTTT CATTACTAGA AGAACCACAA GTTATTGAGT CGGACGAGGA CTGGATTACA 3780 

GATAAAAAGA AAGAACTGAA TGACGCATTA TTTTACTTTA ATGTACCTGC AGAAGTACAA 3840

GATGTAACTG AAGGTCCAAG TGTTACAAGA TTTGAATTAT CAGTTGAAAA AGGTGTTAAA 3900 

GTTTCAAGAA TTACGGCATT ACAAGATGAC ATTAAAATGG CATTGGCAGC GAAAGATATT 3960 

CGTATAGAAG CGCCTATTCC AGGAACTAGT CGTGTTGGTA TTGAAGTTCC GAACCAAAAT 4020

CCAACGACAG TCAACTTACG TTCTATTATT GAATCTCCaA GTTTTAAAAA TGCTGAATCT 4080

AAATTAACAG TTGCGATGGG GTATAGAATT AATAATGAAC CATTACTTAT GGATATTGCT 4140

AAAACGCCAC ACGCACTAAT TGCAGGTGCA ACTGGATCAG GGAAATCAGT TTGTATCAAT 4200

AGTATTTTGA TGTCTTTACT ATATAAAAAT CATCCTGAGG AATTAAGATT ATTACTTATC 4260

GATCCAAAAA TGGTTGAATT AGCTCCTTAT AATGGTTTGC CACATTTAGT TGCACCGGTA 4320

ATTACAGATG TCAAAGCAGC TACACAGAGT TTAAAATGGG CCGTAGAAGA AATGGAACGA 4380

CGTTATAAGT TATTTGCACA TTACCCATGT ACGTAnTATA ACAGCATTTA ACnAAAAAGC 4440

CCCATATGAT GAAAGAATGn CAAAAATTGT CATTGTAaTT GATGAGTTGG CTGATTTAAT 4500

GATGATGGTC CGCAAGAAGT TG 4522
(2) INFORMATION FOR SEQ ID NO: 40:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 751 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:






TCAAG-TTTAC GGATACGTAT ATATTTTGCA TGACATTTAG TGCAATAATA TTCATAATTT 60

GCCCGTTGTT GATAGCTTTC AATGCTGTTA CAAAATCTAG GCGCTCCAAC CTGTTGGCTC 120

AATCGTTTAA AATCTTGATC TTTATGTTGA TAACCTTTAC CAGCAATATG CAAGTGATAA 180

TGACACAATT CGTGCAGTAT aatttttaca ACAGCATCTT CTCCATAATG CTCATATTGT 240

TTTGGATTAA TTTCAATATC ATGGGACTTT AAAAGATAAC GTCCGCCTGT TGTACGTAAC 300

CTTTTATTAA AATATGCACA ATGTCGAAAC GTACGTCCAA ATTTTTCTTC CGAAAGATTC 360

TCAACCATTC GCTGAAGTTT GTCATTATTC ATGTGGATCA ATCATCGTTA ATGATACTTT 420

GTCTTTATTT TTGTCAATAC TGTAAATCCA AACGTCAACG ATATCACCAA CACTGACAAT 480

ATCCATTGGA TTTTTTACGA ACTTCTTAGA AAGTTTCGAA ACATGGACAA GTCCATCTTG 540



TTTCATTCCT TCTTGTAAAT CTTCAATTGA TAGCACATCG GATTTAAGGA TTGGTGTTTC 660

AAACTCGTCC CTTGGATCTC GATTAGGTGC GTTCAAGGAT TTAATAATAT CCTCTAATGT 720

AGGTACACCG ACTTGTAATT CAATCGCCAG T 751

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1076 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

TCTCCAGCTT TAACTTGATC TGGCACTTTA ACAATTGTCT GATCCATACA TACGCGACCA 60

ATAACTTCGC ATTGATGACC ATTTACATTT ACAAAGCTAC CTTGCATTAT GCGTAAATGG 120

CCATCTGCAT ATCCAATAgG TAACAATGCT ATTGTAGTTG GGTCAGTAGC TGTATAAGTT 180

GCACCATAAC TTACAGACTC ACCCGCTTGT AGCGTCTTTG TTTGAACTAC ATTAGCAATT 240

AATTGCACAC TTGGTTTAAG GTGTACTTTA ACTTTTTGCT GTACATACTC TGATGGATAA 300

TATCCATAAA GGGAAATTCC TGGTCTTATT GCATTACAGA ATTGGCAATC CATTAATAGA 360

GAGCCTGCTG AGTTCTGACA ATGTATATAT TCAGGTTTAA TTGCTTCATT GACCATATCT 420

TTAAAACGTT GATATTGTTC AGTTGTCATA TCTCCTGGTT CGTCAGCACA GGCAAAGTGT 480

GTAAACACGC CTTCAAATAC AAGTTGCTCA TATTGTTGAA TGATTTCAAT CACTTCTTGA 540

TACGTTTTAG TATCTTTAAT ACCTAAACGT CCCATTCCTG TATCTAATTT AATGTGCAAC 600

CATAACTTTT TCTCTTGCTC ACCAGAAATG TTTTTAATTG CTTCTTTCAA CCACTGTTTA 660

GACGGAACCG TTAAGGCAAC TCGGTGTTGT ATCGCTTTAT CAATATCTTT AGCTGGTAAC 720

ACACCTAAGA CTAAAATTTT AGCAGTAATC CCATGCATTC TAAGTTCTAT CGCTTCATCT 780

AACGTTGCTA CAGCAAAAAA TGTGGCGCCA TTTTCCATTA AATGACGTGC TACTTTAACA 840

CTACCTAGTC CATAGGCATT GGCTTTAACG ACAGCCATCA CTGTTTTATT TGGATGCAAT 900

GTACTGAATA CTTTGAAATT TGATGCAACA GCGTTTAAAT CTACATTCAT ATACGCAGAT 960

CTATAATATT TATCCGACAT ATTACTTCCT CCTGTAATTC CCACACGTTT TAAAACTAGA 1020

TCTTAATTAT CATTGTATAA CAAATTTAAA ATGCTGACTT TTCTAAAACA ACTTGG 1076
(2) INFORMATION FOR SEQ ID NO: 42:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2930 base pairs

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

TGACCACAAT GCCCAATACA ACCATCCCAT GGTAAAGCCA AGAGATGAGT CAATAAAGCG 60

TGTTGAATAA GAGCTGAATG AACCTGATAC TGGATAAAAT GTTGCCAACT CTCCAATTGA 120

TGACATTAAG AAATATAGCA TGACACCAAT AACAAGATAA GCGAGTATAG CGCCTCCAGG 180

ACCAGCTTGA GAAATGATAT TACCAGTAGC TACAAATAGA CCAGTCCCAA TTGCACCACC 240

TATAGCAATC ATGGAAATGT GTCTTGAGTT AAGACTACGG TTCATTTTAT TATCTTCCAT 300

ATTTAGTCTC CCATCTATTT AAATATACCC ATTATTGTAA GCTTTTTAAG TGTACTATTC 360

AATAACTATT TAGTACTGTA AAGCGAAAAA ATTAAAATTT TCTGATTTTT TAATCATCTT 420

GAGCATGTTT AATTGTAATT TTGATGGGGT TAAATTATAA TATGTATTAA ATTATAATTA 480

TnATAAATTG TGGAGGGaTG ACTATGTCAC AACAAGACAA AAAGTTAACT GGTGTTTTTG 540

GGCATCCAGT ATCAGACCGA GAAAATAGTA TGACAGCAGG GCCTAGGGGA CCTCTTTTAA 600

TGCAAGATAT TTACTTTTTA GAGCAAATGT CTCAATTTGA TAGAGAAGTA ATACCAGAAC 660

GTCGAATGCA TGCCAAAGGT TCTGGTGCAT TTGGGACATT TACTGTAACT AAAGATATAA 720

CAAAATATAC GAATGCTAAA AtATTCTCTG AAATAGGTAA GCAAACCGAA ATGTTTGCCC 780

GTTTCTCTAC TGTAGCAGGA GAACGTGGTG CTGCTGATGC GGAcGTGACA TTCGAGGATT 840

TGCGTTAAAG TTCTACACTG AAGAAGGGAA CTGGGaTTTA GTAGGGAATA ACACACCaGT 900

ATTCTTCTTT AGAGATCCAA AGTTATTTGT TAGTTTAAAT CGTGCGGTGA AACGAGATCC 960

TAGAACAAAT ATGAGAGATG CACAAAATAA CTGGGATTTC TGGaCGGGTt TCCAGAAGCA 1020

TTGCACCAAG TAACGATCTT AATGTCAGAT AGAGGGATTC CTAAAGATTT ACGTCATATG 1080

CATCGGTTCG GTTCTCACAC ATACTCTATG TATAATGATT CTGGTGAACG TGTTTGGGTT 1140

AAATTCCATT TTAGAACGCA ACAAGGTATT GAAAACTTAA CTGATGAAGA AGCTGCTGAA 1200

ATTATAGCTA CAGATCGTGA TTCATCTCAA CGCGATTTAT TCGAAGCCAT TGAAAAAGGT 1260

GATTATCCAA AATGGACAAT GTATATTCAA GTAATGACTG AGGAACAAGC TAAAAACCAT 1320

AAAGATAATC CATTTGATTT AACAAAAGTA TGGTATCACG ATGAGTATCC TCTAATTGAA 1380

GTTGGAGAGT TTGAATTAAA TAGAAATCCA GATAATTACT TTATGGATGT TGAACAAGCT 1440

GCGTTTGCAC CAACTAATAT TATTCCAGGA TTAGATTTTT CTCCAGACAA AATGCTGCAA 1500

GGGCGTTTAT TCTCATATGG CGATGCGCAA AGATATCGAT TAGGAGTTAA TCATTGGCAG 1560

GGTCAAATGC GCGTAGTTGA CAATAACCAA GGTGGAGGAA CACATTATTA TCCAAATAAC 1680

CATGGTAAAT TTGATTCTCA ACCTGAATAT AAAAAGCCAC CATTCCCAAC TGATGGATAC 1740

GGCTATGAAT ATAATCAACG TCAAGATGAT GATAATTATT TTGAACAACC AGGTAAATTG 1800

TTTAGATTAC AATCAGAGGA CGCTAAAGAA AGAATTTTTA CAAATACAGC AAATGCAATG 1860

GAAGGCGTAA CGGATGATGT TAAACGACGT CATATTCGTC ATTGTTACAA AGCTGACCCA 1920

GAATATGGTA AAGGTGTTGC AAAAGCATTA GGTATTGATA TAAATTCTAT TGATCTTGAA 1980

ACTGAAAATG ATGAAACATA CGAAAACTTT GAAAAATAAA TTTGATATGT AGTTTCTATA 2040

TTGCGTAGTT GAGCAGTTTA TGATATCATA ATAAATCGTA AAGATTCCTA ACAAGAGAGG 2100

GTGTTTAACG TGCGCGTAAA CGTAACATTA GCATGCACAG AATGTGGCGA TCGTAACTAT 2160

ATCACTACTA AAAATAAACG TAATAATCCT GAGCGTATTG AAATGAAAAA ATATTGCCCA 2220

AGATTAAACA AATATACGTT ACATCGTGAA ACTAAGTAAT TCTTATCATT CAAATACGAC 2280

GATTTGAAAA TAAAGCGGGC TTACCTATTA TATTGGGGAG CTCGCTTTTT TATGAAATTT 2340

TTGTGAAGAG TGATTAATGG ATTGAGTTTC ATCGGTAGAA CAATATATGA TTATATTAGT 2400

TGTTACTTTA TTAAAaTTTG AGAATATTTA TAGAAGGAAA TAGATTACTG ATTTTATAAA 2460

GTCACTTTGT TAGCGAATGC TTGAAAGAGT ATTTAATATA GTAGAATTTA AAATTTCAAA 2520

GCGGAATTTA ATAAGTACGA AGTAGTTCTG GGTATGTTTT ATAAATGTTC GATAATACAC 2580

TTTAATCTTA AATATGATGG TTTAGAAAAT GATTTAACAA AGAAATGAaA CTTTACTGTT 2640

GAATTATGTG AGGATTGTGT TATTATATAA ATCGTAATAA TTACGATTTG ATAAAAAGTG 2700

AGGTAACTAT ATATGGCTAA GAAATCTAAA ATAGCAAAAG AGAGAAAAAG AGAAGAGTTA 2760

GTAAATAAAT ATTACGAATT ACGTAAAGAG TTAAAAGCAA AAGGTGATTA CGAAGCGTTA 2820

AGAAAATTAC CAAGAGATTC ATCACCTACA CGTTTAACTA GAAGATGTAA AGTAACTGGA 2880

AGACCTAGAG GTGTATTACG TAAATTTGAA ATGTCTCGTA TTGCGTTTAG 2930
(2) INFORMATION FOR SEQ ID NO: 43:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3606 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43:

CTTCTTGCCA TGGCTCTCTT TATTTAAAAA TGCTTCCAAC TTGTCCATTT GATTGTTTCT 60

TTATAAAAAA CTAATTTTAC AAATGCTTTT GCGTTCTTAC AAAAAATGCA TTTGACTATT 180

ATTATAATAA GCGTATAATT GTCGCATATT ATTTTTTGTA TTTTTGGCAA TAACGAAGGA 240

GTATTTATGA ATAAAGACAA GCAATTGCAC AACGACAAAA TCAATCTATC CCAATTAGTC 300

TTATTAGGGT TAGGCTCTTT AATAGGATCT GGTTGGCTAT TTGGTGCGTG GGAAGCATCA 360

TCAATAGCTG GACCAGCAGC AATCATATCA TGGGTTCTTG GATTCCTAGT CATTGGAACC 420

ATTGCCTATA ACTACATTGA AATCGGCACA ATGTTTCCTC AATCAGGTGG CATGAGTAAC 480

TATGCCCAGT ATACACATGG CTCATTATTA GGCTTTATTG CTGCTTGGGC GAATTGGGTG 540

TCTTTGGTGA CAATAATACC TATCGAAGCT GTGTCAGCTG TTCAATATAT GAGTTCTTGG 600

CCGTGGCATT GGGCGAAACC AATGAGATAT TTAATGGAAA ATGGCTCTAT TAGCACATAC 660

GGATTGCTAG CTGTATATCT CATCATTGTT ATTTTTTCAT TATTAAACTA TTGGTCCGTA 720

AAACTTTTAA CATCATTTAC GAGTTTAATT TCTGTATTTA AATTAGGCGT ACCCATGTTA 780

ACCATCATCA TGTTGATGCT ATCAGGATTC GACACTTCAA ATTACGGCCA TTCGGCAAGC 840

ACATTTATGC CTTACGGAAG TGCACCGATT TTTGCTGCAA CAACAGCATC AGGGATTATT 900

TTTTCATTCA ATTCATTCCA GACAATTATT AATATGGGTT GAGAAATTAA AAATCCTGAA 960

AAAAATATCG CAAGAGGCAT CGCTATCTCA CTGTCAATCA GTGCAGTGTT GTACATCATT 1020

TTACAAAGTA CGTTTATCAC TTCTATGCCT CAATCAATGT TACAACATAG TGGATGGAAT 1080

GGCATCAACT TCAATTCACC ATTTGCTGAT TTAGCTATCT TATTAGGAAT TAATTGGCTC 1140

GCAATTTTAC TATACATTGA AGCTTTTGTA TCACCATTCG GTACTGGCGT GTCATTTGTC 1200

GCCGTTACAG GTCGAGTTTT ACGAGCAATG GAGAAAAATG GACATATCCC TAAATTTCTT 1260

GGGAAGATGA ATGAAAAATA TCATATCCCA CGTGTAGCAA TCATCTTTAA TGCCATCATT 1320

AGTJSfrGATTA TGGTTACATT ATTTAGAGAT TGGGGTACGC TAGCAGCAGT TATTTCTACT 1380

GCAACTTTAG TAGCCTATTT AACTGGCCCA ACGACAGTGA TTGCATTAAG AAAAATGGGA 1440

CCAACAATGA CTCGTCCATT TAGAGCAAAA ATTTTAAAAG TAATGGCACC ATTATCATTT 1500

GTATTAGCTT CATTAGCTAT ATATTGGGCA ATGTGGCCAA CAACGGCTGA AGTTATTTTA 1560

ATCATTATAC TTGGATTACC AATCTACTTC TTCTATGAAT ATCGTATGAA TTGGCGTAAT 1620

ACAAAGAAAC AAATTGGTGG TAGCTTATGG ATTATTGTAT ATTTAATCGT GCTATCAATA 1680

CTGTCATTTA TAGGAAGCAA AGAATTTAAA GGCTTAAATA TGATTCACTA TCCATTTGAC 1740

TTTATCGTTA TTATTATTGT GGCACTTATC TTCTATTACA TCGGTACAAC GAGTTCATTT 1800

GAAAGCGTCT ATTTCCGTCG CGCAACACGA ATCAATACGA AGATGCGTGA GTCACTAAAT 1860

CACACACATT AACCAACCAT TGATTTCAAC ATCTTGGTTG GTTTTTTATT TTGAAAATCG 1980

GTTATAAATA ACTAACATAA CAAGATGATG ATCAGGCTGG GACATAAATC AATGTTCTAT 2040

GCTCTACGAA gTTATATTGG CAGTAGTTGA CTGAACGAAA ATGCGCTTGT AACAAGCTTT 2100

TTTCGATTCT AGTCAGGGGC CCCAACACAG AGAATTTCGA AAAGAAATTC TACAGGCAAT 2160

GCAAGTTGGG GTGGGACGAC GATAAAGAAA XACTTTTTCT ATAGAAATTA GTATytCTTA 2220

TGCATGAGTT TTACTCATGT ATTCATATTT TTAAGTACAC ATTAGCTGTG GCTAATGTAT 2280

AAGAACCACT ACATAATAAA TCATTTGTGG CTCTTTATCA TTTCTGTCCC ACTCCCGTAG 2340

AAGTACATCA TATAATGCTG AAAATGGTTT GAGTTAAAAC AGATATCAAG CTCGTCTGAT 2400

TCAGTCACAA AATTGTCTTG TTATACTTGT CACCTATCAT CTATAGACCG TGGTATGATT 2460

AAATTGGGGA TGATAAAGGA GGTTAATAAA TATGAAGATT AATACTACAG GTGGTCAAAT 2520

TCATGGTATT ACACAAGATG GTTTAGATAT CTTCTTAGGC ATTCCTTATG CAGAACCACC 2580

AGTTCATGAC AATCGCTTTA AACATTCTAC GTTAAAAACA CAATGGTCAG AGCCAATTGA 2640

TGCAACTGAA ATACAACCCA TCCCACCGCA ACCAGACAAC AAATTAGAAG ATTTTTTCTC 2700

CTCACAATCT ACAACTTTTA CTGAACATGA AGACTGTTTA TATCTAAATA TTTGGAAACA 2760

ACATAATGAT CAGACGAAGA AACCTGTCAT CATTTATTTT TATGGTGGTA GTTTTGAAAA 2820

TGGTCATGGT ACAGCCGAAC fCTATCAACC GGCACATTTA GTACAAAATA ACGACATTAT 2880

CGTTATTACA TGCAATTATC GTTTAGGCGC ATTAGGATAT TTAGACTGGT CATATTTTAA 2940

TAAAGATTTT CATTCCAATA ATGGCCTTTC AGATCAAATC AATGTCATAA AATGGGTGCA 3000

TCAATTTATT GAATCCTTCG GTGGCGACGC TAATAACATT ACTTTAATGG GTCAGTCTGC 3060

AGGCAGTATG AGCATTTTGA CTTTACTTAA AATACCTGAC ATTGAGCCAT ACTTCCATAA 3120

AGTQGTTCTA CTAAGTGGCG CACTACGATT AGACACCCTT GAGAGTGCAC GCAATAAAGC 3180

ACAACATTTC CAAAAAATGA TGCTCGATTA TTTAGATACA GATGATGTTA CATCATTATC 3240

GACAAATGAT ATTCTTATGC TGATGGCGAA gcTAAAACAA TCTCGAGGAC CTTCTAAAGG 3300

GCTTGATTTA ATATATGCGC CTATTAAAAC AGATTATATA CAAAATAATT ATCCAACAAC 3360

GAAACCAATT TTTGCATGTT ATACAAAAGA TGAAGGCGAT ATTTATATTA CTAGTGAACA 3420

GAAAAAATTA TCGCCGCAAC GCTTTATCGA CATTATGGAA TTAAATGATA TTCCTTTAAA 3480

ATACGAAGAT GTTCAGACGG CGAAGcAACA ATCTTTAGCG ATTACACATT GTTATTTCaA 3540

ACAGCCGATG aAGCAATTTT TACmACmACT CAATATACmA GATTCCAACC GCACCAACTA 3600

TGGCTT
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15109 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

GAAATTAAAA AAGCAATTGG nACAAGATGC AACAGTGTCA TTGTTTGATG AATTTGATAA 60

AAAATTATAC ACTTACGGCG ATAACTGGGG TCGTGGTGGA GAAGTATTAT ATCAAGCATT 120

TGGTTTGAAA ATGCAACsAG AACAACAAAA GTTAACTGCA AAAGCAGGTT GGGCTGAAGT 180

GAAACAAGAA GAAATTGAAA AATATGCTGG TGATTACATT GTGAGTACAA GTGAAGGTAA 240

ACCTACACCA GGATACGAAT CAACAAACAT GTGGaAGAAT TTGAAAGCTA CTAAAGAAGG 300

ACATATTGTT AAAGTTGATG CTGGTACATA CTGGTACAAC GATCCTTATA CATTAGATTT 360

CATGCGTAAA GATTTAAAAG AmAAATTAAT TAAAGCTGCA AAATAATTCA GCTATATAAG 420

TTAGTGAAAT GAGAGTCTGA AACATATCAA TCTTTTGATA TTGTATTAGG CTCTTATTTT 480

TATAGCTAGA AAGTTAGATA TTTGTATTTT TTTAAATAAT AAGTGCCGTT GTTATOGTTC 540

AATTTAATTA ATGATAGATT AGTATTATTA TAGCTAAAGT AGTATACCTG AGAAAATAGC 600

TCAATGTATC TCTTTATTAA TAAGTTATAT CATAATTATT TTAGTGCATA CTTTATGGAA 660

GGGATATCAG GGAATGGCTT TCAATTAAAG AAGAGGTTTA AAAGGATTAC AACAGAATGT 720

TATGATTTTG TAGAAAGATA TATAACAACG TTTTATAAAA ACATAATATT GTTAATGGAA 780

AATGAAATGT AAGGGGGATT TCGAGTGACT AAGAAAGTTT ATTTTAACCA CGATGGTGGT 840

GTAGATGATT TAGTATCTCT ATTTTTATTA TTACAAATGG AAAACGTTCA ATTGATAGGG 900

GTOfiSTACAA TTGGTGCTGA TTGTTATTTA GAGCCATCTT TGAGCGCATC AGTAAAAATT 960

ATTAATCGTT TTTCAAATGA AGATATTCAA GTTGCGCCAT CATATGAACG AGGAAAAAAT 1020

CCATTTCCTA AAGAATGGCG TATGCATGCC TTTTTTATGG ACGCATTGCC AATTTTAAAT 1080

GAGCCAGTCA AACATGTTGC TTCAAATGTG AGCGACAAAG AAGCCTTTGA AGACATTATT 1140

CAAACTTTAA AGAGACAATC AGAAAAAGTA ACATTATTAT TTACAGGCCC GCTTACAGAT 1200

TTAGCAAAAG CACTACAAAA AGATTCATCT ATCGTTCAGT ATATAGAAAA ATTAGTTTGG 1260

ATGGGTGGCA CCTTTTTACC AAAAGGAAAT GTTGAAGAAC CTGAGCATGA TGGTTCTGCA 1320

GAATGGAATG CATATTGGGA TCCAGAAGCG GTTAAAATTG TTTTTGATAG CGATATAGAG 1380

ATTGATATGG TTGCTTTAGA AAGTACGAAT CAAGTACCGC TAACGTTAGA TGTTAGACAA 1440

GTACCACCAT TAACACACTT TATAACAAAT TCTACTTACT TTTTATGGGA TGTTTTAACG 1560

ACTGCTTATA TTGGTAACAA GGACTTGGTT CATTCAATTG AGAAAAAAGT CGATGTAATA 1620

AGTTATGGAC CAAGTCAAGG TAAGACATTT GAGTGTAAAG ATGGGCGCAA AATTAATGTC 1680

ATAAATCATG TAGATAACAA CGCATTTTTT GATTATATAA CTGCACTTGC TAAAAAAGTA 1740

AATTAACAGC TGTGTAGAAT AATTAAGGTT TTAATTTATA TAGAACAACT TATTGTAAAC 1800

TTTTCATTTC TTAAAGTTTA CAATGGTGCT ATAATAATGG TCATGAAATA CGAAAGGAAG 1860

TAAAAAATGA CAACAAAACA GTTAGTATAT ACAGCTTTAA TGACAGCGAT TATCGCTATT 1920

TTAGGATTGG TACCGGTAAT TCCACTACCA TTTTCTTCAG TACCAATTGT ACTTCAAAAC 1980

ATTGGTATTT TCTTAGCAGG TGCGATTTTA GGACGTAAAT ATGGCACATT AAGTGTTATC 2040

GTCTTTTTAT TATTAGTAGT TGCTGGCTTG CCATTGTTAT CAGGTGGTCG CGGTGGCATC 2100

GGTGTATTCG CAGGTCCTTC AGCAGGGTTT TTACTATTAT ATCCAGTTGT AGCATTCATG 2160

ATTGGGGCGA TTCGAGATAG ATTCATCAAT GAAATTAATT TCTGGATTTT ATTCGTTGGT 2220

ATTTTAGTTT TTGGTGTTAT AGCATTAGAT GTTATTGGTA CATTGATTAT GGGCATGATT 2280

ATTAACATAC CATTTACGAA AGCTATTTCA ATTTCATTAG CTTATTTGCC TGGTGATATA 2340

TTAAAAGCAA TTGTAGCAAG TTTGATTGGT ACAGCTTTAC TTAATCACTC GCAGTTTCGT 2400

CAAATTATGG GAATAAAATA ATCATATTTA AGATAGTAAA GTAATTGAAT AAGTTGCTTT 2460

GAAATTTATA AAAGTGAAAG GAGTAGGTGT CAATGGCTAG TATAAGTATG TCAGATATAT 2520

ATTGTAACGG CACTATATTT GAAAATGACG ACGAGCAGTT GATTTATTTA ACGCCTTCTT 2580

TTCCACAACG ATACACAAGT AACACATGGA TATATAAAAA GACGCCTACC CAAGAGCGAT 2640

GGCTGAAAGA CTTAGAACGT CAACATCAAT TACATACAAA TCAAGGTTCA AATCATTATG 2700

CGTTTAGTTT CCCGGAAAAT GAACAACTTG ATAATCATTG GATGGCTATG TTTAAAGATA 2760

TGAATTTTGA ACTAGGTATT ATGGAATTGT ATGCCATAGA AAGTGATGCG CTTGCCAATT 2820

TGCCGCGTAA CTCTGACGTT GAAATTGCCA TCGTTGACGA GTCGCATATA GATGCCTATT 2880

TAAAAGTTGC ATATCAGTTT AGTTTGCCAT TTGGAAAAGA CTATGCAGAT GCACATGAAG 2940

AAATGGTAAG GGAACATTAT CAAAAAGATG TGATTAAACG CTTAGTAGCT TATTTAAATA 3000

ATGAACCTAT TGGCGTTGTA GATGTCATTG AAAGTGAAAA TTACATTGAA TTAGATGGAT 3060

TTGGTGTATT AGAACAATTT CGGCACCAAG GAATTGGATC TACAATTCAA TCGTTGATAG 3120

GTGAATACGC CATATCAAAA AATCACAAAC CAATCATATT AGTTGCAGAT GGTGAAGATA 3180

CAGCAAAAGA TATGTATGCA AAGCAAGGTT ATGTCTATCA ATCGTTTTGT TATCAAATAT 3240

TAAGCTGGTT TCGAGTAGAA ATCAACTTAC TGCTTTTTAA ATTGTTTTGA GCTACTTATA 3360

CTTATAAAAA TAGTGCGTTT AAATTGTTGA TTCATGTAGA ATATCGTTCA TTATGACACA 3420

CTATAATGAA TATGTTATTG TTCAGAATCA ATGATACGTT CTGGATGACT GTATATATTA 3480

AAGCCACCAT TTCGAATAAA TCCAACTGCC GTAATATTTA GGTCATTAGC TAAGGTTACA 3540

GCAAGCGTTG TCGGAGCTGA TTTAGATAAA ATGACGCCAA CACCAATTTT TGCGGCTTTA 3600

ATTAAAATTT CTGATGAAAT ACGTCCACTA AAAATTAATA CTTTATCTCG GACAGTAATA 3660

TGTCGCTGAA TACAAAATCC ATATAATTTA TCTAGAGCGT TATGTCTACC AATGTCTTGT 3720

CGATGTACAA AAAATGTCAA ACCATCGCTT ATAGCAGCAT TATGTAAGCC ACCTGTTTCT 3780

TGGTAAATAT GACTTGCACT TTGTAATCGA GTCATCATGT TAATAATTTG CATTGGAGTT 3840

AAAGTGATTT TAGACATAGA TGTTTTAGCG ATAGCAGCAT CATTTTGAAA ATAAAACTCA 3900

CGACTCTTTC CGCAACAAGA TGCAATCATT CGTTTTGTGG AATATTGAAA GCGATCGCCT 3960

AAATCTTTAT TAAGTTCAAC ATGGGCAAAA CCTTTACTAT CATCAATCAG TACAGATTTT 4020

AATTCATCTC GCTTTAAAAT GGCACCTTCC GAAGCCAGAA ATCCAATGAC TAACTCCTCA 4080

AGGTTTGTTG GACTGCATAT AACAGTCGCA AATTCTTCAC CATTCACCAT AATTGTAAGT 4140

GGAAATTCTG TCACATATTG ATCTGTTGTA TTGAATAATT TTCCATCTTC ATATCTAACA 4200

ATTGGTTGAC CTAAAGATAC ATCTTTGTTC ATTATCTAAC CCCTTTAATT AGCTTAAACT 4260

TTATTTTAAA GCAATTTGCT TAAAATTTTA ACATATTTGC TTAAGTTTGA AATTTGATTG 4320

ATAAAAATTA ATAGCGAGCA ATCTGTTTGA TTTAAATTGA ATTCGAGAAT ATACATACTA 4380

GGGCATCAAT TAATAAATAT CAATCTTATG CAAATTTGAC AATTGTTTGA ATCAATATAT 4440

AAACAGGCAA CGGTTCTTTT CAAATATAAT AGTAAGTGTA TAATGAAAAT GTAAATATTA 4500

TTAAAAATGG GGGTTCACTC AATGAAATTG AAACGTTTAT TTGCTGTTGT GATTGCAATG 4560

CTTTTAGTAT TAGCTGGTTG CTCTAATTCT AACGATAATA ATGAAAGTAA AAAAGATGAC 4620

GCAGACAATG GTAAGAAACA AGAGATTCAA GTTGCAGCGG CAGCAAGTTT AACAGATGTA 4680

ACCAAGAAAT TAGCTTCAGA ATTTAAAAAA GAGCATAAAA ATGCTGATAT TAAATTTAAC 4740

TATGGTGGAT CAGGGGCATT AAGAAAACAA ATTGAATCAG GCGCACCTGT TGACGTATTT 4800

ATGTCTGCAA ATACTAAAGA TGTAGATGCA TTAAAAGACA AGAATAAAGC GCATGATACA 4860

TATAAATATG CGAAAAATAG TCTAGTATTA ATTGGTGATA AAGATTCAAA TTACACTTCA 4920

GTAAAAGACT TAAAAGACAA TGATAAATTA GCATTAGGTG AAGTGAAAAC TGTACCAGCA 4980

GGAAAATATG CGAAACAGTA TTTAGATAAC AATAACTTAT TTAAAGAAGT CGAAAGTAAA 5040

CAAGGTTTTG TGTATAAAAC TGACTTATAT AAACAAAATA AAAAAATTGA TACTGTAAAA 5160

GTAATTAAAG AAGTAGAACT TAAGAAGCCA ATCACATACG AAGCTGGTGC TACATCAGAT 5220

AGTAAATTAG CAAAAGAGTG GATGGAATTC TTAAAATCAG ATAAAGCTAA AGAAATACTA 5280

AAAGAATACC ACTTTGCAGC ATAAGGAGTT GTAATCCATG CCTGACTTAA CACCTTTTTG 5340

GATATCAATA CGAGTTGCTG TAATCAGTAC GATTATTGTA ACGGTTTTAG GTATTTTTAT 5400

ATCTAAATGG TTGTATCGTC GTAAGGGTTC GTGGGTTAAA GTATTGGAAA GTTTATTGAT 5460

ATTACCTATT GTTTTGCCGC CAACGGTATT AGGTTTTATT CTATTAATCA TCTTCTCGCC 5520

AAGAGGACGA ATCGGTCAAT TCTTTGCGAA TGTACTACAT TTACCTGTAG TGTTCACTTT 5580

GACAGGTGCT GTGATAGCAT CTGTCATTGT TAGTTTTCCA CTAATGTATC AACATACTGT 5640

GCAAGGCTTC AGAGGTATAG ACACGAAAAT GATTAATACA GCTAGAACGA TGGGAGCAAG 5700

f t 1 /^ ji^^a CX\ AAA s,T >> T* r T w Ff^r^*Tr^A AATTAATTIT AGCATTAwCT AAACGCTCTA TTTTAGGAGG 5760

TATAATGATG AGTTTTGCTC GTGCATTAGG TGAGTTXGGT GCTACATTAA TGGTTGCAGG 5820

ATATATTCCA AATAAAACGA ATACACTACC TTTAGAAATA TACTTCTTAG TGGAACAAGG 5880

TAGAGAAAAT GAAGCGTGGT TATGGGTATT AGTGCTAGTC GCATTCTCTA TTGTGGTTAT 5940

ATCTACAATT AATTTATTGA ATAAAGATAA ATATAAGGAG GTCGACTAGA TGCTTAAAAT 6000

CAATGTGAAA TATCAATTAA AGAACACTTT AATTCGCATC AATATAGATG ATACTGAACC 6060

AAAAATTTAT GCAGTTCGTG GTCCATCTGG CATTGGTAAA ACTACTGTTT TAAATATGAT 6120

TGCCGGATTA CGTAAAGCAG ATGAAGCTAT TATCGAAGTG AATGGGCAAT TACTTACTGA 6180

TACGGCAAAA AACGTGAATG TTAAAATTCA ACAACGACGT ATTGGATATC TGTTTCAAGA 6240

CTACCAATTG TTTCCTAATA TGACGGTCTA TAAAAATATT ACTTTTATGG CTGAACCATC 6300

TGAACACATC GATCAATTAA TTCAAACTTT AAACATTGAT CATTTGATGA AACAATATCC 6360

TATGACATTG TCAGGTGGAG AGGCACAACG TGTAGCACTT GCACGTGCAC TTAGCACrAA 6420

ACCAGATTTA ATTTTATTAG ATGAACCTTT TTCTAGTTTG GATGATACTA CAAAAGATGA 6480

GAGTATTACA TTAGTTAAAC GTATTTTCAA CGAATGGCAA ATACCAATCA TATTTGTGAC 6540

ACATTCAAAC TATGAAGCAG AACAAATGGC TCATGAAATT ATTACAATTG GGTAATCATT 6600

TATTTGCCAT TAAAGAGTTT AGAACGTATT TAAAATTGTA GAAGTGAATG CTTCTATCAG 6660

CATTTTAATG ATGTTTTAAA CTCTTTTTTA GGGGCAGTTT TTTTGAGAGA CATTGACGCG 6720

CGTCATATAA TGAAAGTAAT GATAAAAAGA AAGGATAACT TAATGTGAGT CAAGAACGTT 6780

ATTCAAGGCA AATTTTATTT AAACAAATAG GTGAAATAGG TCAAAGCAAA ATAAATCAAA 6840

GAGCAGGCAT TGCCAAACTA ATCATTGTTG ATAGAGATTA TATTGAATTT AGTAATTTAC 6960

AAAGACAAAC ATTGTTTACT GAAGAAGATG CTTTGAAAAT GATGCCTAAG GTGGTTGCAG 7020

CTAAAAAGCA TTTGCTAGCG TTACGTAGTG ATGTTGATAT TGATGATTAT ATTGCCCATG 7080

TGGATTATTA TTTTTTGGAA ACACATGGAC AGGACGTTGA CGTTATTATT GATGCAACCG 7140

ATAACTTTGA AACACGACAA CTGATTAATG ATTTTGCATA TAAATATCGT ATACCTTGGA 7200

TTTATGGTGG TGTTGTACAG AGTACATATA CAGAAGCTGC ATTTATACCT GGTAAAACAC 7260

CTTGCTTTAA CTGTTTGGTA CCACAATTGC CAGCATTAAA TTTAACATGT GATACAGTAG 7320

GGGTCATTCA ACCTGCCGTG ACGATGGCAA CAAGTTTACA ATTAAGAGAT GCGATGAAAG 7380

TATTAACGGA ACAACCAATT GACACAAAAA TAACTTATGG CGATATTTGG GAAGGTAGTC 7440

ATTATTCATT TGGTTTCAGT AAAATGCAAC GTTCAGACTG TACAACTTGT GGAGATGTAC 7500

CAAGTTATCC GTATTTAAAC AAGAATGAAC AACGTTATGC AACATTGTGT GGTAGAGACA 7560

CTGTACAGTA TGAAAATGCA TCAATTACAC ACGAGATTCT TGTTCAATTT TTAAAACAAC 7620

ATCAGTTAAA TTATCGCAGT AATTCGTATA TGGTTATGTT fGAATTTAAA GGACACCGCA 7680

TTGTTGCTTT TAAAGGTGGA AGGTTTTTAA TACATGGCAT GACACGCACA TCAGATGCCA 7740

CACATCTAAT GAATTTATTG TTTGGATAAA AAAAGATAAG ACAAAAGGAG TGTAATATTA 7800

TGGGCGAACA TCAAAACGTT AAATTGAATC GTACAGTTAA AGCAGCCGTA CTAACGGTAT 7860

CAGATACTAG AGACTTTGAT ACAGATAAAG GTGGTCAATG CGTGCGCCAA CTATTACAAG 7920

CAGATGACGT TGAAGTGAGT GACGCACATT ATACAATTGT GAAAGATGAA AAAGTAGCGA 7980

TCACGACGCA GGTGAAGAAG TGGTTAGAAG AAGATATTGA TGTCATCATT ACGACTGGTG 8040

GAACAGGTAT TGCACAACGT GATGTGACGA TTGAAGCAGT AAAACCACTT TTAACTAAAG 8100

AGATAGAAGG CTTTGGGGAA TTGTTTAGAT ATTTGAGTTA TGTTGAAGAT GTTGGCACGC 8160

GTGCATTATT GTCTCGTGCT GTAGCAGGTA CAGTTAATAA TAAATTGATA TTTTCGATTC 8220

CAGGATCAAC AGGCGCAGTT AAATTAGCAT TAGAAAAGCT CATTAAACCA GAATTAAATC 8280

ATCTGATTCA TGAGCTTACA AAATAATTTA TTGATTTGAT TGGCGTTGAA AATCTCCAGA 8340

TTTACCGCCA GACTTGCTTT CAAGGTAGGT TTCGCCAATA ATCATACCTT TATCAACTGC 8400

TTTCGTCATG TCGTAAATGG TTAAAGCCGT TGCTGATGCA GCGGTTAAAG CTTCCATTTC 8460

AACACCGGTT TTGCCAGTTG TAGAGACAGT TGTTTGAATG TTTAAAGTAT AAAGGGGTGC 8520

ATTTGTTTCA TCCCAGCTGA AGTGAACATC TATGCCAGTC AATGGTAATG GATGGCACAT 8580

CGGAATAAGT GTTGATGTAT TTTTGGCAGC CATAATACCA GCGATTTGAG CAGTGTTCAA 8640

AATGCTTGAA TGAGCGACAG CAGTTCTTTT TGTAATTTGT TTGTCTGATA CATCGACCAT 8760

TTTGGCGTGG CCTTGTTGAT TAATATGAGT AAACTCAGTC ATTTTACCCC TCCTAGTGCA 8820

TCTAGTATAT CATGAAAAAA TAAAAGTTTT GGAGATGATT TTTAATGGTA GTAGAAAAAA 8880

GAAACCCAAT CCCAGTTAAA GAAGCAATTC AACGTATCGT TAATCAGCAG AGTTCAATGC 8940

CGGCAATTAC GGTAGCACTT GAAAAAAGTC TAAATCATAT CTTAGCAGAA GATATTGTAG 9000

CTACTTATGA TATACCAAGG TTTGATAAAT CACCTTATGA TGGTTTTGCA ATTCGCAGTG 9060

TTGATTCACA AGGGGCAAGT GGTCAGAATC GCATTGAGTT TAAAGTGATT GATCATATTG 9120

GTGCAGGTTC AGTTTCTGAT AAATTAGTTG GGGATCACGA AGCGGTGCGT ATTATGACTG 9180

GAGCACAAAT ACCTAATGGC GCAGATGCTG TTGTTATGTT TGAACAAACG ATTGAACTAG 9240

AAGATACATT TACAATTCGT AAACCATTTT CAAAAAATGA AAATATATCT TTAAAAGGTG 9300

AAGAAACAAA GACAGGCGAT GTTGTTCTAA AAAAAGGACA AGTAATTAAT CCAGGGGCTA 9360

TCGCGGTCCT TGCAACATAT GGCTATGCAG AGGTTAAAGT TATTAAGCAA CCGAGTGTCG 9420

CTGTTATTGC AACAGGAAGC GAATTATTAG ATGTTAATGA TGTATTAGAA GATGGGAAAA 9480

TTCGTAACTC TAATGGCCCA ATGATTCGTG CCTTAGCAGA AAAATTAGGT CTTGAAGTTG 9540

GTATTTACAA AACACAAAAA GATGATTTAG ATAGTGGCAT CCAAGTCGTT AAAGAAGCTA 9600

TGGAAAAACA TGATATCGTT ATTACAACGG GCGGAGTTTC TGTTGGAGAT TTTGACTATT 9660

TACCTGAGAT TTATAAGGCT GTAAAGGCGG AAGTGTTATT TAATAAAGTA GCAATGCGTC 9720

CTGGTAGCGT AACAACGGTT GCATTTGTAG ATGGaAAGTA TTTGTTTGGa TTATCTGGAA 9780

ATCCATCAGC TTGTTTTACA GGATTTGAAC TATTTGTGAA nCCAGCTGTT AAACATATGT 9840

GTGGCGCACT AGAAGTCTTC CCGCAAATAA TTAAAGCAAC ATTAATGGAA GATTTTACCA 9900

AGGCAAACCC ATTCACACGA TTTATACGTG CTAAAGCAAC GTTAACAAGT GCTGGAGCTA 9960

CTGTAGTACC TTCAGGATTC AATAAATCAG GTGGGGTTGT AGCGATTGCA CATGCTAACT 10020

GTATGGTCAT GTTACCAGGA GGGTCACGTG GTTTTAAAGC GGGGCATACA GTAGATATTA 10080

TATTGACtGA ATCTGACGCT GCTGAAGAGG AACTTCTTTT ATGATTTTAC AAATTGTAGG 10140

TTACAAAAAG TCTGGTAAGA CAACATTGAT GAGGCATATT GTCTCTTTCT TAAAGTCACA 10200

TGGTTATACA GTTGCTACTA TTAAACATCA TGGGCATGGT AAGGAAGATA TTCAATTACA 10260

GGATTCAGAC GTCGATCACA TGAAGCATTT TGAAGCGGGG GCAGATCAAA GTATTGTACA 10320

AGGTTTTCAA TATCAGCAAA CTGTAACACG TGTAGATAAT CAAAATCTTA CTCAAATTAT 10380

TGAAAAATCT GTTACAATTG ACACCAATAT CGTATTAGTT GAAGGCTTTA AAAATGCTGA 10440

GAATGTTTGT TATAGCATTA ATGTAAGGGA GCATGAAGAT TTTACAGCAT TTGAGCAATG 10560

GTTATTAAAT AAAATTAAAA ATGATTGTGA TACACAATTA ACATAGAGGA TTGAAATGAA 10620

TGAAACAATT TGAAATCGTG ACAGAACCGA TACAAACAGA ACAATATCGT GAATTCACTA 10680

TAAATGAATA TCAAGGTGCA GTAGTTGTTT TTACCGGTCA TGTTCGCGAA TGGACTAAAG 10740

GCGTCAAAAC GGAATATTTA GAATATGAAG CGTATATTCC AATGGCTGAA AAGAAATTGG 10800

CACAAATTGG AGATGAAATA AATGAAAAAT GGCCTGGAAC GATAACGAGT ATTGTTCATA 10860

GAATAGGGCC ATTACAAATT TCAGATATCG CTGTATTAAT TGCGGTTTCT TCACCGCATC 10920

GTAAAGATGC CTATCGAGCA AATGAATATG CAATTGAGCG TATAAAAGAA ATTGTTCCGA 10980

TTTGGAAAAA AGAAATTTGG GAAGATGGTT CAAAATGGCA AGGGCATCAA AAAGGGAATT 11040

ATGAAGAAGC AAAGAGGGAG GAATAAGAGA GATGAAGGTA CTTTACTTCG CAGAAATTAA 11100

AGATATATTA CAAAAAGCAC AGGAAGATAT TGTGCTTGAA CAAGCATTGA CTGTACAACA 11160

ATTTGAAGAT TTATTGTT7G AACGTTATCC GCAAATCAAT AATAAAAAGT TTCAAGTTGC 11220

TGTAAATGAG GAATTTGTAC AAAAATCGGA TTTCATTCAA CCTAATGATA CTGTTGCATT 11280

AATTCCACCG GTTAGTGGAG GTTAAGGGAG CATGAAAGCA ATAATTCTTG CAGGTGGTCA 11340

TTCAGTGCGA TTTGGTAAGC CCAAAGCTTT TGCGGAAGTG AACGGTGAGA CCTTTTATAG 11400

TAGAGTAATT AAGACATTAG AATCAACAAA TATGTTCAAT GAAATTATTA TTAGTACAAA 11460

TGCGCAATTG GCAACGCAAT TTAAATATCC AAATGTTGTT ATAGATGATG AGAATCATAA 11520

TGATAAAGGT CCATTAGCAG GAATTTATAC AATCATGAAG CAACATCCTG AAGAAGAATT 11580

GTTTTTTGTC GTTTCTGTTG ATACACCAAT GATTACTGGT AAAGCTGTAA GCACGTTGTA 11640

TCAGTTTTTA GTTTCTCATC TTATTGAAAA TCATTTAGAT GTCGCAGCTT TTAAAGAAGA 11700

TGGACGTTTT ATTCCAACAA TTGCATTTTA TAGTCCGAAT GCATTAGGCG CTATAACTAA 11760

AGCACTACAT TCTGATAATT ACAGTTTTAA AAATGTATAT CATGAATTAT CAACGGATTA 11820

TTTGGATGTA AGGGATGTAG ATGCGCCCTC ATATTGGTAC AAAAATATAA ATTATCAGCA 11880

TGATTTGGAC GCTTTAATTC AAAAATTGTA AGCTGTTAGG AGGTCCACAA ATGGTAGAAC 11940

AAATAAAAGA TAAACTAGGA CGTCCCATCC GTGACTTACG GTTATCTGTG ACAGATCGGT 12000

GTAACTTTAG GTGTGATTAT TGCATGCCTA AAGAGGTATT TGGAGATGAT TTCGTATTTT 12060

TACCTAAAAA TGAACTTTTA ACGTTTGATG AAATGGCTAG AATCGCTAAG GTATATGCAG 12120

AATTAGGTGT AAAAAAAATA CGCATTACAG GTGGAGAACC ATTGATGCGA CGGGATTTAG 12180

ATGTACTTAT AGCTAAATTA AATCAAATCG ATGGTATTGA AGATATTGGT TTGACTACAA 12240

ATGTCAGTTT GGATGCTATT GATGATACGC TATTTCAATC AATCAATAAT CGTAATATTA 12360

AAGCGACTAC GATTTTAGAA CAAATTGATT ACGCGACGTC TATTGGTTTG AATGTAAAAG 12420

TAAATGTTGT TATACAAAAA GGTATTAACG ATGATCAAAT CATACCAATG CTTGAATATT 12480

TTAAAGATAA ACATATAGAG ATTCGATTTA TAGAATTTAT GGATGTTGGT AATGATAATG 12540

GATGGGATTT CAGTAAAGTT GTAACTAAAG ATGAAATGCT TACAATGATA GAGCAGCACT 12600

TTGAAATCGA TCCTGTAGAA CCAAAATATT TTGGGGAAGT AGCAAAATAT TATCGCCATA 12660

AGGATAATGG TGTTCAATTT GGTTTGATTA CAAGTGTTTC ACAATCATTT TGTTCTACAT 12720

GTACACGCGC AAGGCTGTCA TCAGATGGGA AGTTTTACGG ATGTTTATTT GCAACTGTCG 12780

ATGGATTTAA CGTTAAAGCG TTTATTCGTT CTGGCGTGAC CGACGAAGAA TTAAAAGAAC 12840

AATTTAAAGC TTTATGGCAA ATAAGAGATG ATCGATATTC AGATGAGAGA ACTGCTCAAA 12900

CAGTTGCCAA TCGTCAACGT AAAAAGATAA ACATGAATTA TATTGGTGGT TAATGTGTAG 12960

GGACCACTAC ATATTAAATC ATTAGAGATG TTTTAATATT TCTGTCTTAC TCCCTAAAAT 13020

ACAATATTAT TTATTAAAGT AAAAACGGTC ATATCTATGC CAGATTTAAT AGAAATGATC 13080

GTTTTTAAAG TTTTTACAAG TTGGCGGGGC CCCAACACAG AAGCTGACAG AAAGTCAGCT 13140

TACAATAATG TGCAAGTTGG CGGGGCCCCA ACATAGAGAA TTTCAAAAAG AAATTCTACA 13200

GACAATGCAA GTTGGGGAAC GGGGCCCCAA CACAGAAGGT GACGAAAAGT CAGCATACAA 13260

TAATGTGCAA GTTGGCGGGG CCCCAACATA GAGAATTTCA AAAGAAATTC TACAGACAAT 13320

GCAAGTTGGG GATCAACGAA ATAAATTTTA TGAGAATATC ATTTCTATCC CACTCTTAAG 13380

AATCACTACA TAATAAATCT TTAGTGGTTC TTTAACATTG ATGTCACACT CCATGCCATT 13440

GAGTTGTAAT ATATCTTTTT TAGGTATAAA TGTTGTCGAA TAAACAACAA GTTGTCCAAA 13500

AGATXXAAAT CTAAACAAGA TATAGCCAGC AATTTAATAT TTGTAATAGA TAAAATGCTA 13560

AGTTTGATAT ATAATAAATT TAAGTAATTG TATAATAATA TGAATTACAA ACATCTAAGA 13620

AGAAACATAG GAGGCATCAT ATTATGAGTA ATAAAGTTCA ACGTTTTATA GAAGCAGAAA 13680

GGGAGTTAAG TCAGTTAAAG CACTGGTTAA AAACAACACA TAAGATTTCA ATTGAAGAAT 13740

TTGTAGTCCT TTTTAAAGTG TATGAAGCTG AAAAGATTAG CGGTAAAGAA TTGAGGGATm 13800

CATTACATTT TGAAATGCTA TGGGATACAA GTAAAATCGA TGTGATTATC CGTAAAaTCT 13860

ATAAAAAAGA GCTTATTTCT AAATTGCGTT CTGAAACGGA TGAAAGACAA GTATTCTATT 13920

TCTATAGTAC TTCTCAAAAG AAATTGTTAG ATAAAATTAC TAAAGAAATA GAAGTGTTAA 13980

GCGTTACAAA CTAAAAACTT aAAAAgcaTG CCAATCTCTA TTCATCATAA TTGCGTCTTG 14040

GTTCATGGCA TTTCTAGTTA CATGACGTCC ATGAATTAAG AAGTAAACAA GCATAGTAAT 14160

GATTGCTAAA GCGGCCATAA AGCCGAAGAT TTCACTATAT GAAAACATAT GAGTAAATAA 14220

CCCAAGGAAT GATGGACCGA AGCCGACACC TGCATCTAGA CCAACGTAAA AAGTAGATGT 14280

CGCGATACCA TATTTAATCG GGGGTGAGAC TTTTATCGCA ATAGATTGCA TTGCAGATGA 14340

TAAATTTCCA TACCCTAAAC CTAGGCAAGC ACCAGCAAGT AATATTAACC AGCTTTGATA 14400

GCTTGAAATT AAGCATACAA ATGAAAGGAA AAGCATGATA AATGCTGGGT AGACAATAAT 14460

ATTTTCATTT TTATCATCCA TCAATCTACC AGCAATAGGT CTAGTAATTA ACGATGCTAT 14520

AGCATAGCAA ATAAAGAAAT AGCTTGCTGC AGTGACTAGG TGTCGCTCTA AAGCAAATGC 14580

TTGTAAATAA GTTAGGATGG ACGCATAGGT AACGCCAATT AAAAGCATAA TTACAGCAAC 14640

AGGAATGGCC TCTTTTGCAA TAAATTGATG AATACTAAAT CTTGGTTTAT CAATGACATT 14700

AGTTTCAGTT TTGTTATTTG TTACTTCGAA ATCAACTTTT ATAAATAATG AGATAATGAG 14760

TCCGAGTATG CCTAATATGA CACAAATAAT AAACAGTAAG TCAATTGGGT ATTTTGTAAT 14820

AAGTAACATG CCTAGAAATG GGCCAATCGC TGTACCTAAT ACTAAACTTA AGGAAAATAA 14880

ACTGATGCCT TCACTTTTTC TATTAACAGG GGTAACGTAT GCCGCAATAG TACCTGTTGC 14940

AGTTGTCACA ACTGCAGTTG CGATACCGTT TATGAGACGT ACAAAGATTA AAAAAGCTAA 15000

AGATCCATCA ATAAAATAAA GTAATTGCGZ GATAATTAAA GCAATTAAAC CAATAAATAA 15060

TAATCGTTTA GGTCCrATTT sATTTACAAA TTTACCTGTA GCAAATCGA 15109
(2) INFORMATION FOR SEQ ID NO: 45: 
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9072 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:

GAGAGTCAAT GGCAAGAAGA ATATAAATAT TTGAGAGCGT TAATCTTTAA TGAAACAGAA 60 

TTAGAGGAAG CGTATAAATG GATGCATCCT TGTTACACGT TGAATAATAA AAATGTAGTA 120 

CTTATCCATG GCTTCAAAAA TTATGTTGCA CTATTATTTC ATAAAGGTGC CATTTTGGAG 180 

GATAAATATC ATACACTCAT TCAACAGACT GAAAAGGTGC AAGCAGCTCG TCAGTTACGA 240

TTTGAAAATT TAACAGAGAT TCAAGCACGT ACCGAAGAAA TTAAATATTA TCTAGCCGAA 300 

GCAATTAAAG CTGAAAAAGC TGGTAAAAAA GTTGAAATGA AGAAAACAGA GGAATATGTT 360 
AAATTAACGC CAGGCAGACA ACATCAATAT ATATATCATA TTGGACAAGC TAAACGCAgT 480

GgAACAAGAC AAAAGCGTGT TGAAAAGTAT ATTAACCAAA TACTAGAAGG TAAAGGGATG 540

CATGATAAGT AATTAATGAG TAAAGCATAC CGGTTATACA ACAACATACA AGATGACACG 600

AAACAACCAA TGGCTCATGC TGTTGGTTGT TTTTTTAGGT GTGTCTGTCA TGGGCAACAC 660

TTTGACGTTG GAATTCCGTT ACAGGCTTGG GAGTAGAAAA TGTTAGCAAA AGGCAAGGGT 720

GTCTACAATG AATGATGAAG ATATTAAAAT ATAAGGATGA CTTTGTGAGT GGCGGATGGG 780

CGGTTGTCCG TCTGTAACAA TGGATGCGTG TGCATTATTA CAAAAATTCG ACTTTTGTAA 840

TAATATTTCA CATTTTCGAC ACTTTTTTGC TATAAAACAA CCAATTGAGC GATAATAAAT 900

TCGCTTTTAA AAAATATGAG TTATCTATTT AGTTGCCAAA GATAAAATAA TAATGTTTAA 960

TAACATCATA TAGAGTATGT TAGTTTTAAA TGTCGAATAT ACGAATGTGc AAACAAAGTA 1020

ATCGGTAGAA ATTCAACATA CATAGCGCCG TTTACTGTTA AGTATTCACA TTACAGATGA 1080

AAAATATAAA ATTCTACATA ATCAAGACCA TGATGTGTAC TTGTTTAACT TATGACTCTA 1140

TTTGTTTAAC AATTGCGATA ATGGTCTTTT TATTTTATGC GTATCATTCG TCATATTTTT 1200

TATGAGGAAG GAGAAATGAT TATGTTAAGT ATTAAGCATT TAACGAAAAT TTATTCTGGT 1260

AATAAAAAGG CAGTAGATGA CATCTCTTTA GATATTCAAT CTGGGGAATT TATCGCATTT 1320

ATTGGAACCA GTGGAAGTGG CAAAACGACT GCTTTAAGAA TGATAAACCG TATGATTGAA 1380

GCGACAGAAG GACAAATTGA AATTGATGGT AAAGATGTTC GGAGTATGAA TCCTGTCGAA 1440

TTGCGTAGAA ATATTGGCTA TGTTATTCAA CAAATTGGCT TAATGCCTCA TATGACGATT 1500

AAAGAGAATA TTGTGTTGGT ACCCAAATTG TTGAAATGGA CTAAAGAGGA AAAGGATAAA 1560

CGTGCAAAGG AATTAATTAA ACTTGTGGAT TTACCGGAGT CATTTTTAGA GCGTTATCCA 1620

GCAGAACTAT CAGGTGGGCA ACAACAACGT ATCGGTGTTG TAAGAGCACT TGCGGCCGAA 1680

CAAGATATTA TTTTAATGGA TGAACCTTTT GGTGCATTGG ATCCTATTAC GAGAGATACG 1740

TTACAAGATT TAGTTAAAAC GTTACAACGA AAATTAGGCA AGACGTTTAT CTTTGTAACA 1800

CATGATATGG ATGAAGCGAT TAAATTAGCA GACAAAATTT GTATTATGTC AGAAGGTAAG 1860

GTGGTGCAAT TTGATACGCC AGACAATATT TTAAGACATC CCGCAAATGA TTTTGTACGT 1920

GATTTTATAG GACAAAATAG ACTGATTCAA GACCGTCCCA ATGACAAGAC TGTAGAAGGT 1980

GTAATGATTA AACCAATCAC GATACAAGCA GAAGCAACAC TGAATGACGC CGTTCATATT 2040

ATGAGACAAA AACGTGTTGA TACTATTTTT GTAGTAGATA GTAATAACCA TTTACTAGGT 2100

TTCTTAGACA TTGAAGATAT AAATCAGGGT ATACGTGGAC ACAAAAGTTT ACGAGACACC 2160

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ATTTTAAAAA gaaacgttag GAATGTACCT GTCGTAGATG ATCAACAGCG TTTAGTAGGA 2280

CIuATTACGC GTGCCAATGT TGTTGATATT GTATATGACA CGATTTGGGG CGATAGTGAG 2340

GATACAC5TGC AAACAGAACA TGTGGGGGAA GACAcTGCGT CCTCAAAAGT GCATGAGCAA 2400

GACACTACTA ATGTCAAAGT ACGTGACATA GGAGATGATA AATCATGATT GAGTTCCTAC 2460

ATGAACATGG TGGACAGTTG ATGTCGAAAA CACTGGAACA TTTCTATATT TCTATAGTGG 2520

CATTATTACT TGCCATCATT GTTGCAGTAC CTATAGGCAT TTTATTATCA AAAACAAAGC 2580

GAACTGCCAA TATTGTATTA ACTGTGGCAG GTGTCTTACA AACTATTCCA ACACTAGCTG 2640

TACTTGCTAT TATGATACCG AirrrrGGTG TTGGTAAAAC GCCTGCAATT GTAGCGCTAT 2700

TTATTTATGT ATTATTACCT ATTTTAAATA ACACGGTACT CGGTGTTCAA AATATTGATA 2760

GCAACATTAA AGAAGCTGGA AAAAGTATGG GAATGACACA ATTTCAATTG ATGAAGGATG 2820

TTGAATTGCC GTTAGCATTG CCGCTTATCA TTGGTGGCAT TCGTTTGTCA TCTGTGTATG 2880

TAATTAGTTG GGCTACACTT GCAAGTTATG TAGGTGCGGG TGGATTAGGT GATTTCATTT 2940

TCAATGGTTT AAATTTATAT GATCCACTGA TGATTGTAAC TGCAACGGTA CTCGTTACTG 3000

CACTAGCATT AGGTGTTGAT GCCTTATTAG CTTTAGTTGA AAAATGGGTA GTTCCCAAAG 3060

GCTTAAAAGT ATCTGGATAA TTAGGAGGCT AAGATAATGA AGAAAATTAA ATATATACTT 3120

GTCGTGTTTG TCTTATCGCT TACCGTATTA TCTGGATGTA GTTTGCCCGG ACTAGGTAGT 3180

AAGAGCACGA AAAATGATGT CAAAATTACA GCATTATCAA CAAGCGAATC GCAAATTATT 3240

TCACATATGT TACGGTTGTT AATAGAGCAT GATACACACG GTAAGATAAA GCCAACATTA 3300

GTAAATAATT TAGGGTCAAG TACGATTCAA CATAATGCCT TAATTAATGG GGATGCTAAT 3360

ATATCAGGTG TTAGATATAA TGGCACAGAT TTAACGGGAG CTTTGAAGGA AGCACCAATT 3420

AAAAATCCTA AGAAAGCAAT GATAGCAACA CAACAAGGAT TTAAAAAGAA ATTTGATCAA 3480

ACGTTTTTTG ATTCGTATGG TTTTGCGAAT ACGTATGCAT TCATGGTAAC GAAGGAAACC 3540

GCTAAAAAAT ATCATTTAGA GACAGTTTCA GATTTAGCAA AGCATAGTAA AGATTTACGT 3600

TTAGGTATGG ATAGTTCATG GATGAATCGT AAAGGCGATG GCTATGAAGG ATTTAAAAAA 3660

gagtatggtt TTGACTTTGG TACAGTGAGA CCAATGCAAA TAGGTCTAGT CTACGACGCA 3720

ttaaactcag AGAAGTTAGA CGTTGCATTA GGTTATTCTA CAGATGGTCG AATTGCGGCG 3780

tatgatttga AAGTACTTAA AGATGATAAA CAATTTTTCC CACCTTATGC TGCGAGTGCT 3840

gttgcaacaa ATGAATTATT ACGGCAACAC CCAGAACTTA AAACGACGAT TAATAAGTTG 3900

acaggaaaga TTTCGACTTC AGAGATGCAA CGCTTGAATT ATGAAGCGGA TGGTAAAGGT 3960

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AAAGGTGGTC ATAAGTAATG GAAGGTAATT TATTACAGCA ATTATTCAAT TATTATGTTA 4080

CGAACTTTGG TTATCTATGG GATTTATTTT TCAAACACTT ATTAATGTCT GTCTATGGTG 4140

TGCTGTTTGC AgCTTTAATT GGTATTCCAT TGGGAATCTT GCTTGCaAGA TACACAAAAC 4200

TTTCTGGATT TGTAATTACA ATTGCAAATA TAATTCAAAC AGTTCCAGTC ATTGCAATGT 4260

TAGCTATTTT AATGTTAGTC ATGGGCTTAG GTTCAGAAAC AGTAGTTTTA ACAGTGTTTT 4320

TATATGCGTT ACTTCCAATT ATAAAAAACA CTTATACTGG TATAGCTAGT GTTGATGCGA 4380

ATATTAAGGA TGCTGGCAAA GGTATGGGAA TGACACGCAA TCAAGTGCTA CGAATGATTG 4440

AATTACCGTT ATCTGTTTCG GTTATTATCG GTGGCATTCG TATTGCCTTG GTTGTTGCGA 4500

TAGGTGTTGT TGCCGTTGGA TCATTTATAG GAGCACCTAC GCTTGGTGAC ATTGTGATTC 4560

GTGGTACAAA TGCGACGGAT GGCACAACGT TTATTTTAGC AGGTGOGATT CCGATTGCTA 4620

TCATTGCAAT CGTCATTGAT GTACTATTAA GATTTTTAGA AAAACGATTA GACCCAACAA 4680

CACGACATCG TAAAAATCAA TCTAATCATC GGCCGCAAAG TATTAATATG TAATAGTAGA 4740

AGATGTTTAT AATTTAGCGA TTTCGTTTCA TGATTTATAA AAAATGAGGC TACTCAAGGA 4800

GCTCAAATAA TCTTTGAGTA GCCTTTTTAT AGGTTGTGTT TGtATGCGTT TACACTAAAA 4860

TAGCAATTAT TATCATGAAA GTTTTTGGAT AAAAAGCGTT AATTATTGTA AAAATACTAA 4920

AAAATGAGAT GTTTTATTTA TAATTTTCTG CAAATTTATG ATATTGTTTC TTAATATATC 4980

ATATTAAAAA TTTGTTTTTC TTAAACATAG GAGGCTTATC TAATTCATGG ACACATCAAA 5040

ACAATTTAGA GGTGACAACC GATTGCTTTT GGGTATCGTT TTAGGGGTTA TTACCTTTTG 5100

GCTATTCGCG CAGTCACTTG TTAATCTTGT TGTCCCATTA CAATCAACAT ATAGTAGTGA 5160

CGTTGGAACG ATAAATATCG CTGTTAGCTT ATCTGCCTTA TTTGCTGGTT TGTTTATCGT 5220

AGGTGCTGGT GATGTTGCTG ATAAATTTGG TCGCGTCAAA ATTACTTATG TAGGATTGAT 5280

ATTAAATGTT GTAGGTTCAT TACTCATCAT CATTACACCT TTGCCAGCAT TTTTAATTAT 5340

AGGTAGAATA ATTCAAGGTT TGTCTGCAGC ATGTATTATG CCATCAACAC TTGCTATTAT 5400

TAACGAATAT TATATTGGTA CAAGAAGACA ACGTGCCTTA AGCTATTGGT CTATTGGTTC 5460

TTGGGGTGGT AGTGGTATTT GTACGTTGTT TGGTGGCTTA ATGGCTACAT ATATAGGTTG 5520

GCGTTCAATA TTTGTTGTTT CAATTCTATT AACATTATTA GCAATGTACT TAATCAAACA 5580

TGCACCTGAG ACTAAAGCAG AACCAATCAA AGGTATGAAA GCAGAAGCTA AAAAGTTTGA 5640

CGTTATTGGT TTAGTCATTT TAGTAGTGAC GATGTTAAGT TTAAATGTAA TCATCACACA 5700

GACGTCTCAT TTTGGTTTAG TTTCACCGTT AATTCTAGGT TTAATTGTTG TGTTTATCTG 5760

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AATTTTTAAA AATAGAGGAT ACAGTGGTGC AACTATTTCA AACTTCTTAT TAAATGGTGT 5880

AGCAGGTGGT GCACTTATCG TTATTAACAC GTATTATCAA CAACAATTAG GATTTAATTC 5940

TTCGCAAACG GGTTATATTT CATTAACGTA TTTAATAACA GTGTTGTCAA TGATTCGTGT 6000

AGGTGAAAAG ATTTTATCTC AACATGGTCC GAAGCGCCCA CTATTACTAG GAAGTGGCTT 6060

TACAGTGATT GGGTTAATCT TATTGTCGTT AACATTTTTA CCAGAAGTGT GGTATATCAT 6120

ATCTAGTATA GTTGGATATT TATTGTTTGG TACTGGTTTA GGATTATATG CTACACCATC 6180

AACTGATACA GCAGTTGCTA GTGCGCCAGA TGATAAGTCG GGTGTTGCTT CAGGTGTGTA 6240

TAAAATGGCG TCATCATTAG GAAATGCATT TGGAGTAGCA GTATCTGGTA CGGTTTATAC 6300

TGTGTTAGCA GCTAATTTAA ATTTGAACTT AGGTGGTTTC ACAGGTATGA TGTTTAATGC 6360

CTTGCTAGCA ATTGTTGCAT TTTTAGTCAT TTTACTATTA GTTCCTAAAA ATCAAACGAA 6420

TTTGTAAAAC TGAAATGAAA GCAAGTTATT ATGTAGGGAT TTTAAAGGAA ATTTTGTGAA 6480

AGTAAGTTTA TCATACACAC TTAATGTTGC GTATTGACGT TTAATGTTAG GTGTGTTCTT 6540

TTATAGACGA TAAAAGCTGT GTGCATATTA AGCGAATGAT TTTCAAATTG ACGCTAATAT 6600

GCGAAAGTAG TATTTTTAAA ATGAACAACA ACGATGAAGA GGGGTTTATA GGATGAAAAT 6660

TGCAATTGCT GGATCGGGTG CATTAGGTAG TGGCTTTGGT GCCAAACTAT TTCAAGCAGG 6720

ATATGATGTC ACACTTATTG ACGGATATAC ATCTCATGTT GAAGCGGTTA AGCAACATGG 6780

ATTAAATATA ACGATTAATG GAGAGGCATT CGAGTTAAAC ATTCCGATGT ATCATTTTAA 6840

TGATCAACCG GACGAAAGCA TTTACGATGT TGTCTTTCTA TTTCCAAAGT CTATGCAATT 6900

AAAAGAAGTG ATGGAAGATA TGAAGCCACA TATTGATAAT GAAACGATCG TCGTATGTAC 6960

GATGAATGGT CTGAAGCATG AAGAAGTCAT TGCGCAGTAT GTTGCTCAAT CACAAATTGT 7020

CAGACGTGTT ACGACTTGGA CGGCAGGTCT TGAAAGCCCT GGACACAGTC ATTTACTTGG 7080

TAGTGGACCA GTTGAAATAG GTGAACTAGT GGATGAAGGT AAAGAAAATG TTATAAAAGT 7140

TGCTGATTTA CTTAAOGAAG CGGAATTGAA TGGTGTCATT AGTAAAGATT TATACCAATC 7200

GATTTGGAAA AAGATTTGTG TTAATGGTAC GGCAAATGCA TTAAGCACAG TGTTGGAGTG 7260

TAATATGGCA TCGCTGAATG AAAGTAGTTA TGCGAAGTGT TTGATTTATA AATTAACGCA 7320

AGAAATAGTG CATGTAGCGA CGATTGATAA TGTTCATTTA AATGTTGATG AAGTATTTGA 7380

ATATTTAGTT GATTTAAATG AAaAAGTTGG TGCGCATTAT CCATCCATGT ATCAAGATTT 7440

AATTGTTAAT AATAGAAAAA CTGAAATTGA TTATATTAAT GGCGCAGTTG CAACATTAGG 7500

TAAACAACGT CaTATTGAAG CGCCAGTCAA TCGCTTTATT ACTGATTTAA TTCATACTAA 7560

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CAATCACGTG ATATTACGGT CATTATTAAG ATTGAAATGT AATAAATAAA GAACAGCAGT 7680

AAGGTACTTT CAAATTGAAA TGATCTTGGT GCTGTTTTTC TTGATTGATC TTCGTCATAA 7740 

TTCAGATTTG TCATAGGcTA CGACATACTA TTAGTATTTA CTAGACAGTT TTTACGACGA 7800 

CACTTTGAAA AATTTTGAGG CAAATCATTT GGAAGTCTCA CGTGAATTTT GTAAACTCAT 7860

CAAGCAAGTA ATTATATTAA AAAGACAAAT AGAGAAAAGG TGTTTATAAT GAGTAAAATT 7920 

TTTGTAACTG GTGCAACGGG CCTTATTGGC ATTAAATTAG TTCAAAGACT AAAAGAAGAG 7980 

GGGCATGAGG TTGCTGGTTT TACTACATCT GAGAATGGTC AACAAAAGCT AGCTGCTGTT 8040

AATGTAAAAG CATATATTGG TGATATATTA AAAGCTGATA CTATTGATCA AGCGTTAGCA 8100 

GATTTTAAAC CAGAAATCAT TATCAATCAA ATTACGGATT TAAAAAATGT TGATATGGCA 8160 

GCAAATACGA AAGTACGTAT TGAAGGTTCT AAAAACCTAA TTGATGCGGC GAAAAAGCAT 8220 

GACGTTAAGA AAGTAATTGC CCAAAGTATT GCCTTTATGT ATGAACCTGG CGAAGGATTA 8280 

GCAAATGAGG AAACTTCACT TGATTTTAAC TCAACTGGCG ATAGAAAAGT AACGGTTGAT 8340 

GGTGTGGTTG GTTTAGAAGA AGAAACGGCT CGTATGGATG AATACGTTGT TTTACGTTTT 8400

GGCTGGTTAT ATGGCCCAGG TACTTGGTAC GGAAAAGATG GCATGATTTA TAATCAATTT 8460 

ATGGATGGTC AAGTGACACT TTCAGATGGC GTAACATCAT TTGTGCATCT TGATGATGCA 8520 

GTTGAAACAT CTATTCAAGC TATTCATTTT GAAAATGGTA TCTATAATGT AGCAGATGAT 8580

GCACCTGTTA AAGGTTCTGA ATTTGCAGAA TGGTATAAAG AACAACTTGG TGTTGAACCA 8640

AATATTGATA TTCAACCTGC GCAACCATTT GAACGTGGCG TAAGCAATGA GAAGTTTAAA 8700 

GCGCAAGGTC GTACTCTGAT TTATCAAACT TGGAAAGATG GCATGAATCC AATTAAATAA 8760 

TAATTTATCC GTTTAATATA CAAAGAATAA AGACTTGGTC GAATCGTGGA TGATATATTA 8820

TCAAACGCAC GGCTCGAACA AGTCTTTTTT ATTATGTCTT CGTTATCTTT GTATGAAGGA 8880

ATAACAGAAT TACAATTAAT GTACTGAATA ATGCAATTAA TGTTGTGATT AGTGCTAATT 8940

TAATTTCTAT TGGTAGCCAA GTCAGTACAA AAGACCAATT ATTGCTACCG AGAATGAGAT 9000

ATGGTAATGC ATATAATATG AGCGCTAAAG CGATACATAT ACATAATGAT AACCAACTCA 9060

ATACAGCAAT CC 9072

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16826 base pairs 
(B) TYPE: nucleic acid

(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

GTGGAACAGC TGTAACTATA TCATTTCTTT CAACATTTAT TGGGAAAATG TTAGCTACAT 60 

TTCTATATCC GATTAATAAT GTAGTACTTT CATATATnTC TGTAAATGAA AGTGACAATA 120 

TAAAGAAGCA ATATTTGaAA ACTAATCTAA TTGCTATAGC TGCCCTATGT TTAGTCATGA 180 

TTATATGTTA TCCAATTACA ATAATTATTG TCTCTTTACT GTATAACATT GATTCAAGTT 240 

TATATTCGAA GTTTATTATT TTAGGTAATA TAGGTGTTTT ATTCAATGCA GTGAGTATTA 300 

TGATCCAAAC TTTAAATACA AAACACGCAT CAATAACATT ACAAGCGAAT TATATGACGC 360

TTCACACGAT TACATTTATA TTCATAACTA TTTTAATGAC AATTGCGTTT GGTCTAAATG 420

GATTCTTTTG GACAACGCTG TTCAGCAACA TTATTAAGTA TGTGATTTTA AATATTATAG 480

GTTTAAAGTC TAAATTCATT AATAAAAAGG ACGTCGATTA GATGAGTGAA AAAAAGATTT 540

TGATTTTATG TCAGTATTTT TATCCGGAAT ATGTATCTTC TGCGACGTTA CCAACTCAAT 600 

TGGCGGAAGA TTTAATTGCG AATCACATTA ATGTCGATGT CATGTGTGGA TGGCCATATG 660 

AATATAGTAA TCATAAACAG GTTTCTAAAA CCGAGATGCA TCGTGGTATT CGCATTCGAC 720 

GTCTCAAGTA TTCGAGGTTT AATAACAAAA GTAAGGTTGG AAGGATCATC AATTTCTTTA 780 

GTTTATTTTC AAAATTCGTG ATTAATATAC CTAAAATGTT GAAATATGAT CAGATTCTTG 840 

TTTACTCTAA TCCACCAATC TTGCCATTAA TACCAGACGT TTTACACAGA CTGCTTAAGA 900 

AAAAATATTC TTTTGTGGTG TATGATATAG CACCTGATAA TGCGATTAAG ACAGGTGCAA 960

CTCGTCCAGG TAGCATGATT GATAAGCTGA TGCGTTACAT TAATAGACAT GTCTACAAGA 1020 

ATGCTGAAAA TGTCATTGTC CTTGGTACGG AAATGAAAAA CTACTTACTA AATCATCAAA 1080 

TTTCTAAAAA TGCTGACAAT ATCCATGTGA TTCCTAACTG GTATGACATG CGTCAATTAC 1140 

AAGflCCAATCG TATCTATAAT GACACATTTA AAGCTTACCG TGAGCAATAC GACAAAATTT 1200 

TATTGTATAG CGGTAATATG GGGCAGTTAC AGGATATGGA GACACTTATC TCATTTTTAA 1260 

AATTAAATAA GGATCAGTCT CAAACGTTAA CAATACTTTG TGGTCATGGT AAGAAATTTG 1320 

CAGATGTCAA AACGGCAATA GaAGACCATC GTATTGAAAA TGTTAAAATG TTTGAGTTTT 1380 

TAACAGGTAC AGACTATGCT GACGTATTAA AAATTGCGGA TGTATGTATT GCATCGCTGA 1440

TTAAAGAAGG CGTCGGTTTA GGCGTGCCGA GCAAGAATTA TGGCTATCTT GCAGCTAAGA 1500 

AAGCGTTGGT ACTCATCATG GATAAGCAAT CTGATATCGT TCAACATGTT GAACAATATG 1560 

ATGCGGGTAT CCAAATTGAT AATGGCGATG CACATGCCAT TTATAACTTC ATCAACACTC 1620 

ACTCGAGTAA GGAATTGCAC GAGATGGGTG AGCGCGCACA TCAACTGTTT AAAGATAAAT 1680 
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AAGCGATTAT TCGATGTAGT GAGTTCAATA TATGGTTTAG TAGTTTTAAG TCCGATTCTG 1800

TTAATTACAG CATTACTAAT TAAAATGGAa TCACCTGGAC CAGCCATTTT CAAACAAAAA 1860

AGACCGACGA TTAATAATGA ATTGTTTAAT ATTTATAAGT TTAGATCAAT GAAAATAGAC 1920

ACACCTAATG TTGCAACTGA TTTAATGGAT TCAACATCGT ATATAACAAA GACAGGGAAG 1980

GTCATTCGTA AGACCTCTAT TGATGAATTG CCACAATTAT TGAATGTTTT AAAAGGAGAA 2040

ATGTCAATTG TAGGTCCTAG ACCAGCGCTT TATAATCAAT ACGAATTAAT CGAAAAACGT 2100

ACAAAAGCGA ACGTGCATAC GATTAGACCA GGTGTGACAG GACTAGCTCA AGTGATGGGG 2160

AGAGATGATA TCACTGATGA TCAAAAAGTA GCGTATGATC ATTATTACTT AACACATCAA 2220

TCTATGATGC TTGATATGTA TATCATATAT AAAACAATTA AAAATATCGT TACTTCAGAA 2280

GGTGTGCATC ACTAATGAGA AAAAATATTT TAATTACAGG CGTACATGGA TATATCGGTA 2340

ATGCTTTAAA AGATAAGCTT ATTGAACAAG GACATCAAGT AGATCAAATT AATGTTAGGA 2400

ATCAATTATG GAAGTCGACC TCGTTCAAAG ATTATGATGT TTTAATTCAT ACAGCAGCTT 2460

TGGTTCACAA CAATTCACCT CAAGCAAGGC TATCTGATTA TATGCAAGTG AATATGTTGG 2520

TGACGAAACA ATTGGCACAA AAGGCTAAAG CTGAAGACGT TAAACAATTT ATIVITATGA 2580

GTACTATGGC AGTTTATGGA AAAGAAGGTC ATGTTGGTAA ATCAGATCAA GTTGATACAC 2640

AAACACCAAT GAACCCTACG ACCAACTATG GTATTTCCAA AAAGTTCGCT GAACAAGCAT 2700

TACAAGAATT GATTAGTGAT TCGTTTAAAG TAGCAATTGT GAGACCACCA ATGATTTATG 2760

GTGCACATTG CCCAGGAAAT TTCCAACGGT TAATGCAATT GTCAAAGCGA TTGCCAATCA 2820

TTCCCAATAT TAACAATCAG CGCAGTGCAT TATATATTAA ACATCTGACA GCATTTATTG 2880

ATCAATTAAT ATCATTAGAA GTGACAGGTG TGTACCATCC TCAAGATAGT TTTTACTTTG 2940

ATAC&TCGTC AGTAATGTAT GAAATACGTC GCCAATCACA TCGTAAAACG GTATTGATCA 3000

ACATGCCTTC AATGCTAAAT AAGTATTTTA ATAAGTTGTC GGTCTTTAGA AAATTATTCG 3060

GCAATTTAAT ATACAGCAAT ACGTTATATG AAAATAATAA TGCACTTGAA ATTATTCCTG 3120

GAAAAATGTC ACTTGTTATT GCGGACATCA TGGATGAAAC GACAACCAAA GATAAGGCAT 3180

AAGTCATCTA TTAAATAAAA TCAACATACA AATCGTTTTA TTTGGAGGTT ATAGTATGAA 3240

GTTAACAGTA GTTGGCTTAG GTTATATTGG TTTACCAACA TCAATTATGT TTGCAAAACA 3300

TGGcGTCGAT GTGCTTGGTG TTGATATTAA TCAGCAAACG ATTGATAAGT TACAAAGTGG 3360

TCAAATTAGT ATTGAAGAAC CTGGATTACA AGAGGTTTAT GAAGAGGTAC TGTCATCGGG 3420

AAAATTGAAG GTATCTACAA CGCCAGATGC ATCTGATGTT TTTATCATTG CCGTTCCGAC 3480

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[illegible] [illegible] AAAAAGGAAA TACCATTATT GTAGAGTCGA CAATTGCGCC 3600

TAAAACYIATfT [illegible] 1 AaaACGAGT CATTGAAAAT TTAGGGTTTA CAATAGGTGA 3660

AGATATTTAT [illegible] \f 1 v- w\vJAACG TGTACTGCCA GGAAAAATTT TAGAAGAATT 3720

AGTTCATAAP [illegible] 1 IvjOUviGTGT GACTGAAGCT TGTATTGAAG CGGGTAAACG 3780

[illegible] [illegible] AWjGAGAAAT GATTGAAACA GATGCACGTA CTGCTGAAAT 3840


GAGTAAGCTA ATGGAAAACA CATATAGAGA CGTGAACATT GCTTTAGCTA ATGAATTAAC 3900

AAAAATTTGC AATAACTTAA ATATTAATGT ATTAGATGTG ATTGAAATGG CAAACAAACA 3960

TCCGCGTGTT AACATCCATC AGCCTGGTCC AGGTGTAGGC GGTCATTGTT TAGCTGTTGA 4020

TCCGTACTTT ATTATTGCTA AAGACCCTGA AAATGCAAAG TTAATTGAAA CTGGACGTGA 4080

AATTAATAAT TCAATGCCGG CCTATGTTGT TGATACAACG AAGCAAATPA TCAAAGTGTT 4140

GAGCGGGAAT AAAGTCACAG TATTTGGTTT AACTTATAAA [illegible] ATGATATAAG 4200

AGAATCACCA GCATTTGATA TTTATGAGCT ATTAAATCAA [illegible] TAGAAGTATG 4260

TGCTTATGAT CCACATGTTG AATTAGATTT TGTGGAACAT GATATGTCAC ATGCTGTCAA 4320

AGACGCATCG CTAGTATTGA TTTTAAGTGA CCACTCAGAA TTTAAAAATT TATCGGACAG 4380

TCATTTTGAT AAAATGAAGC ATAAAGTGAT TTTTGATACA AAAAATGTTG TGAAATCATC 4440

ATTTGAAGAT GTATCGTATT ATAATTATGG CAATATATTT AATTTTATCG ACAAATAAAA 4500

[illegible] TAkiGoCATAC ATGATTAAGG AAAGATAAGC TGTCATGTGT TTGAACTTCA 4560

GAGAGGATAA TGTTATGAAA AAAATTATGG TTATTTTCGG TACGAGACCC GAAGCAATAA 4620

AAATGGCACC ATTAGTAAAA GAAATTGATC ATAATGGGAA CTTTGAAGCG AACATTU'x'UA 4680

TTACAGCACA ACATAGAGAT ATGTTAGATA GTGTGTTAAG TATATTTGAT ATTCAAGCTG 4740

ATCATGATTT AAATATTATG CAAGATCAAC AAACATTAGC AGGCCTTACG GCGAATGCAC 4800

TTGCTAAACT TGATAGCATC ATTAATGAGG AACAACCGGA TATGATTTTA GTACATGGTG 4860

ATACTACAAC GACTTTTGTA GGAAGTTTGG CAGCATTTTA TCATCAAATT CCGGTCGGAC 4920

ATGTAGAAGC TGGACTTCGA ACACATCAGA AATACTCACC ATTTCCTGAA GAGTTAAATC 4980

GAGTCATGGT AAGTAATATT GCTGAATTGA ATTTTGCGCC AACAGTAATT GCAGCTAAAA 5040

ATTTACTTTT TGAAAACAAA GACAAAGAGC GTATCTTTAT TACTGGAAAT ACAGTTATTG 5100

ACGCATTGTC AACAACAGTT CAAAATGATT TTGTTTCAAC GATTATTAAT AAACATAAAG 5160

GCAAGAAAGT TGTTTTACTA ACAGCGCATC GTCGTGAAAA TATTGGGGAA CCGATGCATC 5220

AGATTTTTAA AGCAGTAAGA GATTTGGCAG ATGAATATAA AGATGTTGTC TTCATTTATC 5280

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GGATTGAATT AATTGAGCCA TTAGATGCGA TTGAGTTCCA TAATTTTACA AATCAATCGT 5400

ACCTCGTGCT GACAGATTCT GGTGGTATTC AAGAGGAGGC TCCTACATTT GGAAAACCTG 5460

TGTTGGTATT AAGGAATCAT ACAGAGCGTC CCGAAGGCGT TGAGGCGGGA ACATCGAGAG 5520

TAATTGGCAC AGATTATGAC AATATTGTTC GAAATGTGAA ACAATTGATT GAGGATGATG 5580

AAGCGTATCA ACGTATGAGT CAAGCGAATA ATCCATATGG TGATGGACAA GCATCACGAC 5640

GTATTTGTGA AGCAATAGAA TATTATTTTG GATTGCGCAC AGACAAGCCG GATGAATTCG 5700

TACCTTTACG TCACAAATAA TAAAAAACCC CTAATCATGA AGTTGGTTTA GACAACCAGC 5760

GGTGACTAGG GGTTTTTAAT ATATTTATTT TTGATAGTGG TAGCCAATAT CATATTTGAA 5820

TACTTTATTT GATAATATTG GACTTTGCTG TCCATCGTCA TCACTTTTTA AACGTACATT 5880

TTTATGAGCT TCTTTAAATA CATCGGAATT CAACCAATTA TTAAAGCTAT CTTCAGATTC 5940

CCAAATAGTT AAGATTTTAA CTTCGTCTGT ATCCTCGGTA TTTAATGTTT TAGTGACAAA 6000

CATTTGTTGG AAGCCTTCAA TAGTTTCAAT ACCTTGTCTA TTGTAAAAAC GTTCAATCGT 6060

TTCTTCCGCA CTGCCTTTTT GTAATTGTAA TCTATTTTCT GCCATAAACA TGGGCAATCA 6120

CTCCTCTATT TTATGATTTG ATTTGGGTAA TGTTTTTACA AATGTAAAGA GTACAGCGGT 6180

TTGTATGATA ACCATTATGA TTAATCCTAC ACGGACTGCA AGAACATCCA CCATATAAAT 6240

TGAAAAACCT ATTACAATGT ATAAGCTAAT TAAAATTTTA ATTTTCTGTT GTAGCGTGTA 6300

GCCTCGATGT AAATAAAAGT TTTCTACATA TTCTTTATAA ATTTTTTGAT TAATAAGCCA 6360

ATTGTAAAAG CGATCTGAAC TTCGAGCAAA GCAAAAAACT GCTACGAGTA AAAAAGGGGT 6420

CGTTGGCAGT AAAGGTAATA CGGCACCTGC AATACCAAGC GCTGTAAATA TTAAGCCAAT 6480

GACGATTAAA ATAAGTCGCA TTGAAAAAAC TCCATTCTAG TACTAATGCG CATGTAATAT 6540

TGTTTTAGTA ATATAACTCA TGCTAAATAT AATGTGTATG ATAAGTGCAA TGACTCAGTA 6600

AAATGAAACG ATGTTGAATT ATCCTTGTCA CATTAACGCA TTTTAAGCGC GACTTTCATA 6660

ACAACCAAAC TATTTAATGA GAATTATTCT CAAGTATTAT AGTTATATTA TGTGTTTTAT 6720

TTTTGAAAAG TGCAATATGT TTTCGAAAAT AAGATTATTT TTATGTGCAA AAACGACGCA 6780

AAAGTTTTAA AAATGAGACT TCTGTGAGCT GATTATTTTA TAAAATGTAA AOGCTTACTA 6840

TATAATGTGA ATCATATCGT TTAAAAGCAT TATTAAATAT GATGCTAAGA GATTTATATT 6900

ATAGCCAATA AACAAAGGAG AGATAATATG GCAGTAAACG TTCGAGATTA TATTGCAGAG 6960

AATTATGGTT TATTTATCAA TGGGGAATTT GTTAAAGGTA GCAGTGACGA AACAATCGAA 7020

GTGACTAATC CAGCAACTGG AGAAACACTA TCACATATTA CAAGAGCAAA AGATAAAGAT 7080

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TCAGAACGTG CACAAATGTT GCGTGATATT GGTGATAAAT TAATGGCACA AAAAGATAAA 7200

ATTGCAATGA TTGAAACATT AAATAATGGT AAACCGATTC GTGAGACAAC AGCAATTGAT 7260

ATTCCATTTG CTGCAAGACA TTTCCATTAT TTCGCAAGTG TTATTGAAAC AGAAGAAGGT 7320

ACAGTGAATG ATATCGATAA AGACACAATG AGTATCGTAC GACATGAGCC GATTGGCGTC 7380

GTAGGTGCTG TTGTTGCTTG GAACTTCCCA ATGCTATTAG CTGCATGGAA GATTGCGCCA 7440

gCCATTGCTG CAGGTAATAC AATTGTGATT CAACCTTCGT CTTCAACACC ATTAAGTTTA 7500

TTGGAAGTTG CTAAAATTTT CCAAGAGGTA TTACCTAAAG GTGTTGTCAA TATACTAACG 7560

GGTAAAGGTT CAGAATCAGG TAATGCAATT TTCAATCATG ATGGTGTAGA TAAATTATCA 7620

TTTACGGGCT CAACTGATGT AGGTTATCAA GTTGCCGAAG CTGCAGCAAA ACATCTAGTA 7680

CCCGCTACAT TAGAGCTTGG TGGTAAAAGC GCCAATATCA TATTAGATGA TGCTAATTTA 7740

GACCTTGCAG TTGAAGGTAT TCAGTTAGGT ATTTTATTCA ACCAAGGTGA AGTATGTAGT 7800

GCAGGTTCTC GATTATTAGT TCATGAAAAA ATTTATGATC AATTGGTGCC ACGTTTACAA 7860

GAGGCATTTT CAAATATTAA AGTTGGAAAT CCACAAGATG AAGCTACACA AATGGGTAGT 7920

CAAACTGGTA AGGATCAATT AGATAAAATT CAATCATATA TTGATGCAGC AAAAGAATGA 7980

GATGCACAAA TTTTAGCAGG CGGTCATCGC TTAACTGAAA ATGGATTAGA TAAAGGGTTC 8040

TTCTTTGAGC CGACATTAAT TGctGTGCCA GACAATCATC ACAAATTAGC ACAAGAAGAA 8100

ATATTTGGAC CAGTGTTAAC AGTGATTAAA GTGAAGGACG ATCAAGAAGC AATTGATATA 8160

GCTAATGATT CTGAGTATGG TTTAGCAGGC GGTGTATTTT CTCAAAATAT CACACGTGCA 8220

TTAAATATTG CTAAAGCTGT ACGTACAGGA CGTATTTGGA TTAACACTTA CAACCAAGTA 8280

CCAGAAGGCG CACCATTTGG TGGTTATAAA AAATCAGGTA TCGGTCGAGA AACTTATAAA 8340

GGTGCGTTAA GTAACTATCA ACAAGTTAAA AATATTTATA TTGATACAAG CAATGCTTTA 8400

AAAGGTTTGT ACTAGAATAA ATATCGTTTC TGAAGCGTGT TTGTAGGTCA GTCTAGCGGT 8460

AAGTCTTAAC ATTTAACGGC GTTGTTTAGA TTTTAAGCAA AACAAAATAT ATAGGAACAC 8520

GTATCATGAT ATTAGGATAT AATGACTAAA ATAATAGCAG TAGGATGGTT TTTAATTGCA 8580

AATCATCTTA CTGCTGTTTT TAATTATGCT AATTTGCGAT GCGGCTATTA TAAGGACAGA 8640

GTTGTTTATT AATTATGGTG ATTTAGAAAT ATGAAGTTCA ATATGCAAAG TCATCGTTTG 8700

TTTTAATATG CGGAACAATC ATTAAAGTTA TTGCGATTTT TTGAACTTAA TGAAACTAAA 8760

CAATAAATTT GAGATACTTT TTTGTCATTT TTATGTAACT AACACAATAA TCTCGTACAT 8820

TATTAAAATT TTCTATATGA TAGGAATAAA GCAAAGCGCG AGTGTGCTGT AAAAGTTTTC 8880

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GATGATGTAT AAATCATGGT TAATTACGGA AGCATTAATA TTAACCTGAG AAGCTATAAA 9000

GAATTATTTT TAAAAGCGAC AATATTAAAT ACGACGCATT TATTTAGGAG TGGCAAACGT 9060

ATGAATGGGA AAAAGGCGAA TACGATAAAC AGATACAAAT ATTTTCATCA TGTCAATCAT 9120

CAAAAAATTC AACAAAGTTC TAAAAAGACG CTGTGGGCAT CACTAATCAT CACATTGTTA 9180

TTTACAGTGA TTGAATTTGT CGGAGGTTTA GTATCTAATt CATTGGCATT ACTGTCAGAT 9240

TCATTTCATA TGCTTAGTGA TGTATTAGCA CTTGGTTTAT CTATGTTGGC CATTTATTTT 9300

GCAAGTAAAA AGCCGACTGC ACGATACACA TTTGGATATT TAAGATTTGA GATATTAGCT 9360

GCATTTTTAA ATGGTTTAGC ATTAATTGTA ATTTCAATCT GGATTTTATA TGAAGCTATT 9420

GTACGTATTA TTTATCCGCA ACCAATTGAA AGTGGCATTA TGTTTATGAT TGCTAGTATT 9480

GGTTTACTGG TCAATATTAT TTTGACTGTT ATCCTTGTAA GGTCTTTAAA ACAAGAAGAC 9540

AATATCAATA TTCAAAGTGC ATTATGGCAT TTCATGGGAG ACTTATTGAA CTCTATTGGT 9600

GTCATCGTTG CAGTTGTATT GATTTACTTT ACAGGATGGC GCATCATCGA CCCAATCATT 9660

AGTATTGTAA TTTCACTCAT CATTTTACGT GGTGGTTATA AAATTACGCG TAATGCgTGG 9720

tTAATTTTAA TGGAAAGTGT GCCTCAACAT TTGGATACTG ATCAAATTAT GGCAGATATT 9780

AAAAACATAG ATGGCATATT AGATGTACAT GAATTTCATT TGTGGAGTAT TACAACAGAG 9840

CATTATTCAT TAAGTGCCCA TGTTGTGTTA GATAAAAAAT ATGAGGGTGA TGATTATCAA 9900

GCGATTGATC AAGTATCATC ATTGTTGAAA GAAAAATATG GCATTGCACA TTCAACGTTG 9960

CAAATTGAAA ACTTGCAATT GAATCCATTA GATGAGCCAT ACTTCGACAA ATTAACATAA 10020

ATAAAACATT GTAGCGCCTA AAACATTAAT CTATGTCATA GGCGCACGTT TCGTTTTATA 10080

CTTATGTTGC ATCATTTAAA TGATTTTCGT CAATTTCTTT GATGCTATCT ACATCTAACA 10140

cgacatcttt AGGTTTCAAA ATATGAATAT GTTTTTCATC ATTTGTATGT AAAATGCGTT 10200

ctatgatgta CCTTTGACCG GCCATTGTTT CTACAGCAAT CTTTTTGTTT CTAGCTAAAC 10260

TTGCTACGAC AGATTCTTTA TCCATAATGA TAGCCCCCTA TATATATGTT TATTTACTTA 10320

TACCCTAACA TGATTTTTAT ACTCTTTGAA AATATATTTT ACAGAATTTT ATCTAAATAT 10380

TTAAAAAAAT ATCTTAATAT CCTTGTAATC CGATAAGAAT TATAGTAATA TTTTTTCAAC 10440

CATtGTTATA GGAGGTCTTA TTAATGACAT TATTTTTATT AGAAGCTAAC AATCTTGATT 10500

TTGCATCAAC GAAAGAAGAA CTAGAAGCAA AGGCAGCATC ACTATCTACG AAGACAATTC 10560

CAACATTAAT TGAAGTACAA GCTACTGAAA ATTTAACTCA TGGTTATTTT ATTGTGGAAG 10620

CAAATGACGA aGCAGAAGCT AAACAATTTT TAACAGAAGC AGATATTAGT ATTCAATTAG 10680

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TTGATTACCT TGTAACTTGG AACATTCCGG AAGGCATTAC GATGGATCAA TATTTAGCAC 10800 

GTAAAAAGAA AAATTCTGTT CATTATGAAG AAGTGCCAGA AGTTGAATTT AAACGCACAT 10860

ATGTATGTGA AGATATGTCT AAATGTATTT GTTTATACAA CGCACCTGAT GAAGAAGCGG 10920

TACGTCGCGC GCGCAAAGCA GTTGATACAC CGATTGATGG CATCGAAAAA CTTTAATAAG 10980 

ACAACAAGTT GATGAGATAT ATGTATATAG GTTTGGCATG GATTTCGATT GCAGTTAATT 11040 

AGAATAGCTC AATGCTATAA ATGTAAGTAG TTGATATGAA GAAACTAATG AACTAAATGC 11100 

AAGTATTGTC TAAAACAATC ATTTTATTGA AATTTAGTAG AGCTGAAATT AATATAACGT 11160 

CGTTAATTGA ATAACGCTTA TGTTATAAGA GCACTCATAC CAAACCATAA TCATCTATAG 11220 

ATATAACAAT TCACGATATA AGGGCTGTGT TTGGCATAGC CCTTTAGATA TACACTTAAT 11280 

TCCTATTAAA ATAGTAGGGA TTAAAAGGGG GCTTGTCATG ATTAAAATTC AACAATTACA 11340 

ACATCACTTT GGATCACATA AAGTAATTCA TAACTTTAAT TTGGACATTA GCAAGGGAGA 11400

AATAGTCACT TTCATAGGGA AAAGTGGTTG CGGAAAGTCT ACTTTACTCA ATATTATCGG 11460

TGGATTTATT CATCCATCGT CTGGTCGTGT CATTATTGAT AACGAAATTA AACAACAGCC 11520 

ATCTCCAGAT TGTTTAATGC TATTTCAACA TCATAATTTG CTGCCATGGA AAACGATTAA 11580

TGACAACATT AGGATTGGAT TACAACAGAA AATTAGTGAT GAAGAGATTA ACGCACAGCT 11640 

TAAATTAGTT GATTTAGAAG ACAGGGGAAA GCATTTTCCC GAGCAACTGT CCGGGGGTAT 11700 

GAAACAACGT GTGGCACTAT GTCGAGCGCA TGTGCATAAG CCTAACGTTA TATTGATGGA 11760

TGAGCCATTA GGTGCATTAG ATGCATTTAC ACGTTATAAA CTTCAGGATC AACTAGTGCA 11820 

aCTAAAACAT AAAACGCAAT CAACTATTAT TTTAGTGACG CATGACATTG ATGAAGCTAT 11880

TTATCTTTCC GACCGCATTG TTCTGTTAGG TGAAGGGTGC AATATTATTT CTCAATATGA 11940 

AATTACAGCA TCACATCCAC GCAGTCGTAA TGATAGCCAC CTACTTAAGA TTCGTAATGA 12000 

AATTATGGAA ACATTTGCAT TGAATCATCA TCAAGTTGAA CCTGAATATT ATTTATAAGG 12060 

AGTGAGTGAC GATGAAAAGG TTAAGCATAA TCGTCATCAT TGGAATCTTT ATAATTACAG 12120 

GATGTGATTG GCAAAGGACG TCTAAAGAAC GGTCTAAAAA TGCCCAAAAT CAGCAAGTGA 12180 

TTAAAATTGG ATATTTGCCG ATTACACATT CAGCTAATTT GATGATGACT AAAAAATTAT 1224 0 

TATCACAATA CAATCATCCG AAATATAAAC TAGAATTAGT TAAATTCAAT AATTGGCCAG 123 00 

ATTTAATGGA CGCATTAAAC AGTGGTCGTA TTGATGGTGC ATCAACTTTA ATAGAGCTAG 12360 

CGATGAAATC AAAACAGAAG GGCTCAAATA TAAAGGCTGT GGCATTGGGC CATCATGAAG 12420 

GCAATGTCAT TATGGGACAA AAAGGTATGC ACTTAAATGA ATTTAATAAT AATGGCGATG 12480 
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GTAAACAATT AAAGATTAAA CCGGGGCATT TTAGCTATCA TGAAATGTCG CCAGCAGAAA 12600

TGCCAGCCGC ATTGAGTGAA CACAGAATTA CAGGGTATTC TGTAGCCGAA CCATTCGGTG 12660

CACTGGGTGA AAAGTTAGGC AAAGGTAAGA CTTTGAAACA TGGTGATGAC GTTATACCTG 12720

ATGCGTATTG CTGTGTGCTA GTACTGAGAG GGGAATTGCT TGATCAACAC AAGGATGTAG 12780

CGCAAgCATT TGTACAAGAT TATAAAAAGT CTGGCTTTAA AATGAATGAT CGCAAGCAAA 12840

GTGTAGACAT TATGACGCAT CATTTTAAAC AAAGTCGTGA CGTTTTAACA CAGTCAGCGG 12900

CATGGACATC CTATGGTGAT TTAACAATTA AGCCATCCGG CTATCAAGAA ATTACGACAT 12960

TGGTAAAACA ACATCATTTG TTTAATCCAC CTGCATATGA TGACTTTGTT GAACCGTCAT 13020

TGTATAAGGA GGCATCGCGT TCATGACACG TCCCACAAAT AACAAATTTA TATTACCTAT 13080

TATCACATTT ATTATTTTCT TAGGCATTTG GGAAATGGTC ATTATTATTG GGCATTACCA 13140

ACCTGTATTG TTACCGGGTC CTGCTCTTGT AGGAAAAAGT ATATGGTCTT TCATTGTTAC 13200

TGGAGAAATT TTCCAACATT TAGCAATTAG TTTATGGAGA TTTGTAGCGG GCTTTGTTGT 13260

CGCATTGTTG GTTGCTATTC CATTGGGCTT CTTGCTTGGA AGGAATCGTT GGCTATACAA 13320

CGCTATCGAA CCGCTATTTC AATTGATTAG GCCGATATCT CCGATAGCAT GGGCACCATT 13380

TGTTGTTCTA TGGTTTGGTA TTGGTAGTTT GCCAGCGATT GCGATTATTT TTATCGCTGC 13440

TTTTTTCCCA ATTGTGTTCA ATACTATTAA AGGCGTTAGA GACATTGAAC CTCAATATTT 13500

AAAAATAGCA GCAAATTTAA ATTTAACTGG GTGGTCATTG TATCGCAATA TATTATTTCC 13560

CGGGGCATTT AAACAAATCA TGGCTGGGAT ACATATGGCG GTAGGAACAA GTTGGATATT 13620

TTTAGTTTCT GGTGAAATGA TTGGTGCACA ATCGGGATTA GGTTTTTTAA TCGTTGATGC 13680

ACGAAATATG TTGAACTTAG AAGATGTTTT AGCAGCAATA TTCTTTATCG GATTATTTGG 13740

TTTTATTATT GATCGATTCA TTAGTTATAT TGAGCAGTTT ATACTTAGAA GATTTGGTGA 13800

ATAAGGAGAG ATGATGATGA CTTTAGAAAC GCTTATCAAA GAACAATTAG ATCCTCATTT 13860

AGTAGAAGTT GATGAAGGGA CGTATTATCC GAGAACATTT ATTCAGCAAT TATTTGTAGA 13920

TGGTTATTTC GGTGAGGCGG CATTGAGAAA AAATGCTGAA GTAATCGAAG CTGTATCGCA 13980

GTCTTGTTTG ACAACAGGAT TTTGTTTATG GTGCCAATTA GCTTTTTCAA CGTATTTAGA 14040

AAATGCCACG CAGCCACATT TAAATAATGA CTTACAACAG CAATTGTTAT CTGGAGAAAT 14100

ATTAGGTGCT ACCGGATTGT CTAATCCGAT GAAGTCATTT AATGATTTAG AAAAGTTGAA 14160

CCTTGAACAC ACTTATGTTG ATGGACAATT GGTTGTCAGT GGACGTATGC CAGCTGTAAG 14220

TAATATTCAA GAAGACCATT ATTTTGGTGC GATTTCGAAA CATGAATCAT CAGATGAATT 14280




TTTAGGAGTC AACGGGTCAG CAACGTATCA AATCACATTG AATCAAGTCG TAGTGCCACA 14400

ATCACAAATT ATCACGCATG ATGCGAAGCA GTTTGCGGCA ACTATTCGCC CGCAATTTAT 14460

TGCTTACCAA ATTCCAATAG GATTAGGCTC AATTAAAAGT TCTTTAGAGT TAATTGATGC 14520

ATTTTCAAAT GTGCAAAACG GAATAAATCA ATATTTAGAG TATGATGTTG AAGCTTTTAA 14580

AAAACGTTAT CGTCAACTTA GAGAGGAATA TTATGCAATA TTAGATGACG GTAACTTAAC 14640

TTCACATTTA AATGAATTAA TATCATTGAA GAAGGACATC GGCTATTTAT TGTTAGATGT 14700

AAATCAAGCT TCTGTTGTCA ATGGTGGTTC TAGAGCGTAC ACACCATATT CGCCACAAGT 14760

TCGCAAGTTA AAAGAAGGAT TCTTCTTCGC AGCATTGACA CCGACATTAA GACATTTAGG 14820

TAAACTTGAA GCAGAGTTGA AGGGGTAAGT GTGATAAGCT GATTTTTTGT TTAGATGCGT 14880

TTGTTGAAAC ATTTTTTAAA ATAATATAAA TCTTAGTTTA TAAACATTTT CTGTTAATTT 14940

GTTATATCCT TTTAACTAGG AAAATATACA TTTCGTAATA ATAATAATCG TTATCATTGA 15000

AAAAGTGTTA ATAAGGTGTA TAATGAAAAT GTGAACAATT AATGAACTTC TTATTTTAAA 15060

GAAGGTGAAT ACTATAGATA CGCATACTAA AGAACAACAA TTCTCGAATC TAGTAAGATC 15120

TTATCGTAAA GAATACGTGG GTAAAGGACC CAATAGTATT CGAGTGTCGT TTAAAGATAA 15180

TTGGGCGATT GCACATATGA CAGGTGTTTT GAGTAAAGTT GAGAGTTTTT ACCTAAACGA 15240

CAAACGCAAT GAATCGATGC TCCATTATAC ACGCACAGAG AAGATTAAAC AGATGTATAA 15300

AGAAATAGAT GTAAATGAGA TGGAAAGTCT TGTAGGCGCT AAGTTTGTAA AATTATTTAC 15360

AGATATTGAT TTGAATGATG ATGAAGTCAT TTCAATATTT GTTTTCGATA AGTCAATAGA 15420

ATAAGTGTTG CTGGTGTAAG GTACACGGTG CTGTTTGCTA ACTTCGCTTT GAATTTAACA 15480

ATAATTCAAG GGGGTGGTAT GTCAAACGGT GCCGTTTTTT TGTCATATTT TTAAAACAAG 15540

CAAC2ATGCAA CACGTACTTT AAGGAAGTCA AAATTTATCA TTTAGGAGAG ATGGATATGA 15600

AAATCGTAGC ATTATTTCCA GAAGCAGTAG AAGGTCAAGA AAATCAATTA CTTAATACTA 15660

AAAAAGCATT AGGATTAAAA ACATTTTTAG AGGAAAGAGG ACATGAGTTC ATTATATTAG 15720

CAGATAATGG TGAAGACTTA GATAAACATT TACCAGATAT GGATGTGATT ATTAGTGCGC 15780

CATTTTATCC TGCATATATG ACTCGTGAAC GTATTGAAAA AGCACCGAAC TTGAAATTAG 15840

CAATTACAGC AGGTGTAGGA TCTGACCATG TAGATTTAGC GGCAGCAAGT GAACACAATA 15900

TTGGTGTCGT TGAAGTTACA GGAAGTAATA CAGTTAGTGT GGCAGAACAT GCGGTTATGG 15960

ATTTATTAAT ACTTCTTAGA AACTATGAAG AAGGTCATCG TCAATCAGTA GAAGGTGAAT 16020

GGAACTTGTC TCAAGTAGGT AATCATGCGC ATGAATTACA ACACAAAACA ATTGGTATTT 16080

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TACAACACTA TGATCCAATC AATCAACAAG ACCATAAATT GTCTAAATTT GTAAGCTTTG 16200 
ATGAACTTGT TTCAACAAGT GATGCGATTA CAATTCATGC ACCATTAACA CCAGAAACTG 16260 

ATAACTTATT TGATAAAGAT GTTTTAAGTC GTATGAAAAA ACACAGTTAT TTAGTGAATA 16320
CTGCACGTGG TAAAATTGTA AATCGCGATG CGTTAGTTGA AGCGTTAgCA TCCGAGCATT 16380
TACAAGGATA TGCTGGTGAT GTTTGGTATC CaCAACCtGC ACCTGCTGAT CATCCATGGA 16440
GAACAATGCC TAGAAATGCT ATGACGGTTC ACTATTCAGG TATGACTTTA GAAGCACAAA 16500
AACGTATTGA AGATGGAGTT AAAGATATTT TAGAGCGTTT CTTCAATCAT GAACCTTTCC 16560
AAGATAAAGA TATTATTGTT GCAAGTGGTC GTATTGCTAG TAAAAGTTAT ACAGCTAAAT 16620
AGAATAAGGA TGCTGGGCTA GCGATTAACG CTTTCAATTT TATATAAATG AATCATATAA 16680
GCACTACTGC TGTTGTAAAG ATGGCAGTAG TTTTTTTATG ATTACATCTA AGTATAGTCA 16740
CGGCTATGTT AGGACAATGA TTTAACATTT ACGCACATAT GTGTTCACTT ACGCAATTAT 16800
TGAnAAATnT CATTCATGTG GnAATC 16826

(2) INFORMATION FOR SEQ ID NO: 47:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4012 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47:

TTCAATGAGA GTAGTGGGCT GATGTTTAGC GATATCGCGT AAGATTAACC ATTGGCCATA 60

ATATATATTG TGTTTTTCTA AAATCGGCTC GGCTAATTTT AAATAGGGGC GATATATTGT 120

TATAAAACTA TTGAAAAATT CTTGTGATAG CATAGTGACA TCTCCTAAGA CAAAATAGTT 180

AGCTTAGCTA mCCTTTTTAC AACAATAGTA ATTATAAAAC GGGAGCAATT AGAAATCAAT 240

ATATAATTAT TAAGAGCAAA AATAATTATA CTTTGTTAAA ATAAGCGTAA TTACATGTAA 300

ATAGGGGGAT ACTAATGATA TTGAAATTTG aTCACATCAT TCATTATATA GATCAGTTAG 360

ATCGGTTTAG TTTTCCAGGA GATGTTATAA AATTACATTC AGGTGGGTAT CATCATAAAT 420

ATGGAACATT CAATAAATTA GGTTATATCA ATGAAAATTA TATTGAGCTA CTAGATGTAG 480

AAAATAATGA AAAGTTGAAA AAGATGGCAA AAACGATAGA mGGCGGAGTC GCTTTTGCTA 540

CTCAAATTGT TCAAGAGAAG TATGAGCAAG GCTTTAAAAA TATTTGTTTG CGTACAAATG 600

ATATAGAGGC AGTTAAAAAT AAACTACAAA GTGAGCAGGT TGAAGTAGTA GGGCCGATTC 660
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ATCAGGATGA TGATGAAATT AAGCCACCAT TTTTTATTCA ATGGGAAGAA AGTGATTCCA 780 

TGCGTACTAA AAAATTGCAA AAATATTTTC AAAAACAATT TTCAATTGAA ACTGTTATTG 840

TGAAAAGTAA AAACCGATCA CAAACAGTAT CGAATTGGTT GAAATGGTTT GATATGGACA 900

TTGTAGAAGA GAATGACCAT TACACAGATT TGATTTTAAA AAATGATGAT ATTTATTTTA 960

GAATTGAAGA TGGTAAAGTT TCAAAATATC ATTCGGTTAT CATAAAAGAC GCACAAGCAA 1020

CTTCACCATA TTCAATTTTT ATCAGAGGTG CTATTTATCG CTTTGAACCA TTAGTATAAA 1080

TATACGTAAG TGCTATGAGC GAGAATGCCC ATATGAATAA TGACAAGCAC AATGGAAAGA 1140

ATCGTTAATA TATTATTTAA TCGTGATGAC TTAATTAAAA TGAAAAAGAT TGATAATATA 1200

AATGTGAAAA AGATAAGTAT AACCCGTAAA CTAAAGTAAT TCACGGTGAG AGGTTGACTC 1260

AATGTCATAA TGATTGCAAC GATGTTCATA ATTATAAATA GACTTAAAAT AATTGTTCTC 1320

ATATCAAACA CCTCATTGTT AGATTATTGA CATTATAACA GGGGTAATTG TATATGAACA 1380

TTAATGTGGT TGCTTGAGGA AAAATTTATT CATTGAAGTC AAGTTGGTTC ATTTTAGAAA 1440

TGAATATCGT GTTAGATGAT GAAAGTATAT TGAAGTATAG GTAACTAGTT GAAAAGTATT 1500

AATTGTACGA TAACATTAAA TTTAACACGA AACATAGATA TAAAATGATT CACAATTAAA 1560

ATGGGTAAAT TTGAACTTGC TAAACTATTA ATTGGAGCAT GGACATTTCA AAAATAAGAG 1620

TTCAAATCTT ACACAAGCTC TGAATCGACA CTATAAGATA CAAACTGTAT AATTAAAGGT 1680

ATTGTTAAAT AGAAGGAGAT ATCATAAATC ATGGAAAAGA TGCATATCAC TAATCAGGAA 1740

CATGACGCAT TTGTTAAATC CCACCCAAAT GGAGATTTAT TACAATTAAC GAAATGGGCA 1800

GAAACAAAGA AATTAACTGG ATGGTACGCG CGAAGAATCG CTGTAGGTCG TGACGGTGAA 1860

GTTCAGGGTG TTGCGCAGTT ACTTTTTAAA AAAGTACCTA AATTACCTTA TACGCTATGT 1920

TATMTTCGC GTGGTTTTGT TGTTGATTAT AGTAATAAAG AAGCGTTAAA TGCATTGTTA 1980

GACAGTGCAA AAGAAATTGC TAAAGCTGAG AAAGCGTATG CAATTAAAAT CGATCCTGAT 2040

GTTGAAGTTG ATAAAGGTAC AGATGCTTTG CAAAATTTGA AAGCGCTTGG TTTTAAACAT 2100

AAAGGATTTA AAGAAGGTTT ATCAAAAGAC TACATCCAAC CACGTATGAC TATGATTACA 2160

CCAATTGATA AAAATGATGA TGAGTTATTA AATAGTTTTG AACGCCGAAA TCGTTCAAAA 2220

GTGCGCTTGG CTTTAAAGCG AGGTACGACA GTAGAACGAT CTGATAGAGA AGGTTTAAAA 2280

ACATTTGCTG AGTTAATGAA AATCACTGGG GAACGCGATG GCTTCTTAAC GCGTGATATT 2340

AGTTACTTTG AAAATATTTA TGATGCGTTG CATGAAGATG GAGATGCTGA ACTATTTTTA 2400

GTAAAGTTGG ATCCAAAAGA AAATATAGCG AAAGTAAATC AAGAATTGAA TGAACTTCAT 2460
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CAAAATATGA TTAATGATGC GCAAAATAAA ATTGCTAAAA ATGAAGATTT AAAACGAGAC 25 80 

CTAGAAGCTT TAGAAAAGGA ACATCCTGAA GGTATTTATC TTTCTGGTGC ACTATTAATG 2640

TTTGCTGGCT CAAAATCATA TTACTTATAT GGTGCGTCTT CTAATGAATT TAGAGATTTT 2700

TTACCAAATC ATCATATGCA GTATACGATG ATGAAGTATG CACGTGAACA TGGTGCAACA 2760 

ACTTACGATT TCGGTGGTAC AGATAATGAT CCAGATAAAG ACTCAGAACA TTATGGATTA 2820 

TGGGCATTTA AAAAAGTGTG GGGAACATAC TTAAGTGAAA AGATTGGTGA ATTTGATTAT 2880 

GTATTGAATC AGCCATTGTA CCAATTAATT GAGCAAGTTA AACCGCGTTT AACAAAAGCT 2940

AAAATTAAAA TATCTCGTAA ATTAAAACGA AAATAGATTA ACGACTGAAA TCTGAACGCT 3000 

CATAAGACTG TCATTTGCGT TCAGATTTTT TTACACAATA TAGAATGGTT GAGTAAAATA 3060 

TTTTTGAATA TAGTGAAAGA GGGGGAAGTA CTGTGATAAA AAAGCTATTA CAATTTTCTT 3120 

TAGGGAATAA GTTTGCTATC TTTTTAATGG TTGTTTTAGT TGTCTTGGGC GGTGTATATG 3180 

CGAGTGCTAA ATTGAAATTA GAATTACTAC CAAATGTACA AAATCCAGTT ATTTCAGTTA 3240

CAACAACAAT GCCGGGTGCA ACGCCACAAA GTACCCAAGA TGAAATAAGT AGTAAAATTG 3300 

ACAATCAAGT AAGATCATTG GCATATGTGA AAAATGTTAA AACGCAATCC ATACAAAATG 3360

CTTCAATTGT AACAGTTGAA TATGAAAATA ATACAGATAT GGATAAAGCA GAAGAACAGC 3420

TTAAAAAAGA AATCGATAAA ATTAAATTTA AAGATGAAGT TGGTCAACCA GAATTAAGAC 3480 

GTAATTCGAT GGATGCTTTT CCGGTTTTAG CATATTCATT TTCAAATAAA GAGAATGACT 3540

TGAAAAAAGT AACGAAAGTA CTGAATGAAC AATTAATACC AAAATTGCAA ACGGTAGATG 3600

GTGTGCAAAA TGCGCAATTA AATGGGCAGA CGAACCGTGA AATCACCCTT AAATTTAAGC 3660 

AAAATGAACT TGAAAAATAT GGGTTGACTG CTGATGATGT AGAAAACTAT CTAAAAACGG 3720 

CAACfiAGAAC AACGCCACTT GGATTGTTCC AATTTGGTGA TAAAGATAAT CAATTGTTGT 3780 

TGATGGTCAA TATCAATCTG TTGATGCTTT TAAAAACATA AATATTCCAT TAACGTGGCA 3840 

GGAGGACCAA GGGCATCTCA TCCCAAAGTG ACCATAAACC AAATTCAGCC ATGTCAGACG 3900 

TTATCAGGCA TCACCACAGC AAATTCAAAG CGTCAGCnCC AATATATAGT GGATGCCGCA 3960

nGAACTAGGG GTTTAGCGnT ATCAGTGGTG TGGCGACTCT ATTCTAAACG AT 4012 
(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7778 base pairs 

(B) TYPE: nucleic acid 
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 
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<xi) 


SEQUENCE DESCRIPTION: SEQ ID NO: 48:

CAATATAGGT CGCCGAGTTT CAACTaCATC AACTGGTTCA GTTACATTAG ATAATGCGCT 60

AGGTGTAGGT GGCTATCCTA AAGGACGAAT TATTGAAATT TATGGTCCTG AAAGTTCTGG 120

TAAGACAACA GTAGCGCTTC ACGCTATTGC TGAAGTACAA AGTAATGGCG GGGTGGCAGC 180

ATTTATCGAT GCTGAACATG CTTTAGATCC AGAATATGCT CAAGCATTAG GCGTAGATAT 240

CGATAATTTA TATTTATCGC AACCGGATCA TGGTRAAraa GGTCTTGAAA TCGCCGAAGC 300

ATTTGTTAGA AGTGGTGCAG TTGATATTGT AGTTGTAGAP TCAGTTGCTG CTTTAACACC 360

TAAAGCTGAA AXTGAAGGAG AAATGGGAGA CACTCArnrr GGTTTACAAG CTCGTTTAAT 420

GTCACAAGCG * * w * x ^ X w1\J\l A W> X J\ X X X w X /W^/^ TCAAATACAA CTGCTATTTT 480

CATCAACCAA ATTCGTGAAA AAGTTGGTRT TATrtTTTfirtT AATCCAGAGA CTACACCAGG 540

TGGACGTGCA TTAAAATTCT ATAOTT(* , A.fST A A o A pTar: a a GTACGTCGTG CAGAACAGCT 600

TAAACAAGGA CAAG AAATT^ T A GGTAATAfl A A pt a a a a tt AAAGTCGTTA AAAATAAAGT 660

GGCACCACCA TTTAGAGTAG CTGAAGTTCA TATTATGTAT GGACAAGGTA TTTCTAAAGA 720

GGGTGAACTT ATTGATTTAG GTGTTGAAAA CGAf*ATroTT GaTAAATCAG GAGCATGGTA 780

TTCTTACAAT GGCGAACGAA TGGGTCAAGG TAAflGAAAAT GTTAAAATGT ACTTGAAAGA 840

AAATCCACAA ATTAAAGAAG AAATTGATCG TAAATTHAGA GAAAAATTAG GTATATCTGA 900

TGGTGATGTT GAAGAAACAG AAGATGCACC AAAGTCATTA TTTGACGAAG AATAGTACAC 960

AAATTTATAT CTATAGTTAA ACTTAGCAAA TATCCTTATA GGATTGATTG AAAGTGATAT 1020

TCATCTCATA AAGCTAGAAT AATATCTAAC TTTATGGGAT ACACTACAAA TCGAGACTAT 1080

AAGGTTTTTT ATTTTATTTA TTATTACATT ATCAATAGTT TTATAATCGA GCTTCAAAAC 1140

TTTA6AAAAT AGTAGAAATA GCATTCAATA TAGTGCAAAA GTGCAAATTG ATAACTTGAC 1200

ACTTATCTCC TATAAACCGT ACAATTAATT TGTATGATTT ATATATAATT TCATAAAGTC 1260

ATATTGAATT TCATATAAAG AGCAAACCCT AGAAAAGGAG GTGTTTGTGT GAATTTATTA 1320

XCATTTTGCT GGGGATCATT CTAGGAGTTG TTGGAGGGTA TGTTGTTGCC 1380

CGAAATTTGT TGCTTCAAAA GCAATCACAA GCTAGACAAA CTGCCGAAGA TATTGTAAAT 1440

CAAGCACATA AAGAAGCTGA CAATATCAAA AAAGAGAAAT TACTTGAGGC AAAAGAAGAA 1500

AACCAAATCC TAAGAGAACA AACTGAAGCA GAACTACGAG AAAGACGTAG CGAACTTCAA 1560

AGACAAGAAA CCCGACTTCT TCAAAAAGAA GAAAACTTAG AGCGCAAATC TGATCTATTA 1620

GATAAAAAAG ATGAGATTTT AGAGCAAAAA GAATCAAAAA TTGAAGAAAA ACAACAACAA 1680




CGCATCTCCG GTCTCACTCA AGAAGAAGCT ATTAATGAGC AACTTCAAAG AGTAGAGGAA 1800

GAACTGTCAC AAGATATTGC AGTACTTGTT AAAGAAAAAG AAAAAGAAGC TAAAGAAAAA 1860

GTTGATAAAA CAGCAAAAGA ATTATTAGCT ACAGCAGTAC AAAGATTAGC AGCAGATCAC 1920

ACAAGTGAAT CAACGGTATC AGTAGTTAAC TTACCTAATG ATGAGATGAA AGGTCGAATC 1980

ATTGGACGAG AAGGACGAAA CATCCGCACA CTTGAAACTT TAACTGGCAT TGATTTAATT 2040

ATTGATGACA CACCAGAAGC GGTTATATTA TCTGGTTTTG ATCCAATAAG AAGAGAAATT 2100

GCTAGAACAG CACTTGTTAA CTTAGTATCT GATGGACGTA TTCATCCAGG TAGAATTGAA 2160

GATATGGTCG AAAAAGCTAG AAAAGAAGTA GACGATATTA TTAGAGAAGC AGGTGAACAA 2220

GCTACATTTG AAGTGAACGC ACATAATATG CATCCTGACT TAGTAAAAAT TGTAGGGCGT 2280

TTAAACTATC GTACGAGTTA CGGTCAAAAT GTACTTAAAC ATTCAATTGA AGTTGCGCAT 2340

CTTGCTAGTA TGTTAGCTGC TGAGCTAGGC GAAGATGAGA CATTAGCGAA ACGAGCTGGA 2400

CTTTTACATG ATGTTGGTAA AGCAATTGAT CATGAAGTAG AAGGTAGTCA TGTTGAAATC 2460

GGTGTAGAAT TAGCGAAAAA ATATGGTGAA AATGAAACAG TTATTAATGC AATCCATTCT 2520

CATCATGGTG ATGTTGAACC TACATCTATT ATATCTATCC TTGTTGCTGC TGCAGATGCA 2580

TTGTCTGCGG CTCGTCCAGG TGCAAGAAAA GAAACATTAG AGAATTATAT TCGTCGATTA 2640

GAACGTTTAG AAACGTTATC AGAAAGTTAT GATGGTGTAG AAAAAGCATT TGCGATTCAG 2700

GCAGGTAGAG AAATCCGAGT GATTGTATCT CCTGAAGAAA TTGATGATTT AAAATCTTAT 2760

CGATTGGCTA GAGATATTAA AAATCAGATT GAAGATGAAT TACAATATCC TGGTCATATC 2820

AAGGTGACAG TTGTTCGAGA GACTAGAGCA GTAGAATATG CGAAATAATT TTTGTCTCCC 2880

TCACAAATTA GTGAGGGAGC TTTTTTAAGT TGTAGTCTTA AtCTAGTTAG ACAGCACTTT 2940

ATCXX5TAATA ACTATATTAA ACAGTAGTTA TTTGAAAGTA AGAOGGACCT TATATTAAAT 3000

AAGAAGTTAT TGCTTTTAAT AAAAATGTTT TAGGCTTCGT AATTACTATA TTTATATTAT 3060

GTAAACCTAT AAAGATGATT GGTTTTCTAT CCAATAAAAA AGAAGAGAAG ATGTAACACA 3120

TCTTCTCTTC yGCAATATTA ATTAGGATTT ATTTCTAAGT TGAGTTATTT TAATTGTAAA 3180

TCTGTTTTCT TTAATTCTTT TATAACTTCT GCAGTATCAT AACAATTTGT TGCAATTGTT 3240

GAATATCTCT CTGCTAAACG ATATGCATTA ATGTAAAGCT TTAAACTTTC TTTAGCTATA 3300

TCCTCTGCAT CTTCGAATTT TGATGGGTTA GACATAACCA CTAATTCTGC AAATTTTTCT 3360

GGATCAATAT TAATAGACAT GTATTTATTT ACAACTCCTA TTTATTTTGA TGTCTTAATA 3420

CTAACATATT GAAGTTTTCA GACAAAGTAA TGTCTCTCTA TAATTGAAGA AAAATAATTC 3480
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GGATGAACAA AACATGAGAA TAATGTTTAT AGGGGATATC GTAGGTAAAA TTGGACGAGA 3 600 

CGCAATTGAA ACGTACATAC CTCAACTGAA GCAAAAGTAT AAACCAACAG TTACAATTGT 3660

AAATGCTGAA AATGCAGCAC ATGGTAAAGG TTTGACTGAA AAAATATATA AACAATTACT 3720

AAGAAATGGT GTAGATTTCA TGACTATGGG TAATCACACA TATGGTCAAC GTGAAATTTA 3780

TGATTTTATA GATGAAGCAA AACGACTAGT AAGACCAGCG AATTTTCCGG ATGAAGCGCC 3840

GGGAATTGGT ATGAGATTTA TACAAATTAA TGATATTAAA CTTGCAGTTA TTAATCTGCA 3900

AGGAAGAGCG TTTATGCCAG ATATTGATGA TCCTTTTAAA AAGGCAGATC AATTAGTCAA 3960

GGAAGCACAA GAACAAACTC CGTTTATATT TGTTGATTTT CATGCAGAAA CAACTTCTGA 4020

AAAGTATGCA ATGGGATGGC ATTTAGATGG TAGAsTAGCG CTGTTGTTGG AACGCATACA 4080

CACATTCAAA CAGCAGATGA ACGTATTTTA CCAAAGGGGA CAGGGTATAT AACGGATGTT 4140

GGTATGACAG GTTTTTATGA TGGCATTTTA GGAATAAATA AAACAGAGGT AATTGAGCGT 4200

TTTATCACTA GTTTGCCAGA AAGAGATGTT GTTCCAAATG AAGGTAGAAG TGTATTATCT 4260

GGTGTTGTTA TTGATTTAGA CAAAGAAGGT AAAACAAAGC ACATCGAACG TATATTGATA 4320

AATGATGACC ATCCATTTTC AACATTTTAA AATTACGTAA GTAAACATTC GAATTGGACC 4380

CTATCGTCCA TTAGTATGAA TTTAATATAG TACCACTGTT TACATAGTAA ATCGGTGGTT 4440

CTTTTTGTTA TCATTTAATA TGAAATATAT CCATAGGAGG CATATAACTA TGAAACCACA 4500

ATTATCGTGG AAAGTTGGCG GTCAACAAGG CGAAGGTATT GAATCAACTG GGGAAATCTT 4560

CGCTACGGCT ATGAATAGAA AAGGATATTA TTTATATGGA TATAGACATT TTTCAAGTCG 4620

TATCAAAGGT GGACATACGA ATAATAAAAT TAGAGTTTCT ACGACGCCTG TTCATGCAAT 4680

TAGTGATGAT TTAGATATTT TGATTGCATT TGACCAAGAA ACAATTGATG TTAACCATCA 4740

TGAAATGAGA GAAGACAGTA TTATTTTArC TGATGCCAAG GCTAAACCTG TGAAaCCAGA 4800

AGGATGTCAT GCACAGCTTA TTGAATTACC TTTTACAGCA ACCGCTAAAG AATTAGGTAC 4860

AGCATTAATG AAAAACATGG TTGCAATAGG TGCTACTAGC GCATTGATGA ATTTGAATAC 4920

AAATACATTT GAAGAACTTA TTACTAATAT GTTTTCTAAA AAAGGTGACA AGGTAGTTGA 4980

AGTCAATATC CAAGCATTAA ACGAAGGTTA TCAATTAATG CAATCTCGCT TACCTGAAAT 5040

CTACGGGGAC TTTGAATTAG AGTCAACAGA TGCACTACCA CATCTATATA TGATTGGTAA 5100

CGATGCCATT GGATTAGGTG CAATTGCTGC AGGTTCACAA TTTATGGCGG CATATCCTAT 5160

TACACCTGCG TCTGAAGTTA TGGAATATAT GATTGCCAAT ATATCTAAAG TAAACGGAGC 5220

GGTTATTCAA ACAGAAGATG AAATTGCTGC TGTAACTATG GCTATTGGTG CAAATTATGG 5280
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TGGATTATCT GGTATGACTG AAACGCCATT AGTCATTATT AATACCCAAC GAGGTGGACC 54 00 

TTCTACTGGA TTACCTACGA AACAAGAACA GTCAGATTTA ATGCAAATGA TTTATGGTAC 5460

ACATGGTGAT ATTCCAAAAA TTGTTGTAGC ACCAACAGAT GCAGAAGATG CATTTTATTT 5520

AACTATGGAA GCATTTAATT TAGCAGAACA ATATCAATGC CCTGTTATAG TTCTAAGTGA 5580

TTTGCAATTA TCTTTAGGTA AACAAACTGT TGAAAAATTA GATTATAATC GTATTGAAAT 5640

TAAACGTGGT GAAATCATTC AATCTGATAT TGAACGTGAA GAAGATGATA AAGGTTATTT 5700

CAAGCGTTAT GCGTtAACAT CCGATGGTGT TTCTCCTAGA CCTATCCCCG GTGTTAAAGG 5760

AGGTATTCAT CATATAACTG GTGTGGAaCa CAATGAAGAA GGTAAACCTA GTGAATCTGC 5820

GTCAAATAGA CAACAACAAA TGGAAAAACG AATGCGTAAA ATTGAGCAGT TACTAATTGA 5880

ATCGCCAGTA GAAGCTAACT TACAACATGA GGATGCAGAT ATTCTTTATA TCGGTTTTAT 5940

TTCTACAAAA GGTGCAATTC AAGAAGGTAG TAACCGTTTG AATCAACAAG GCATAAAAGT 6000

TAACACTATA CAAATTAGAC AATTGCATCC ATTCCCAACA AGCGTTATTC AAGATGCAGT 6060

TAATAAAGCG AAGAAAGTCG TTGTAGTGGA GCACAATTAT CAAGGACAAT TGGCTAGTAT 6120

TATAAAAATG AATGTCAATA TTCATGATAA GATTGAAAAT TATACAAAGT ATGATGGGAC 6180

ACCTTTCCTA CCACATGAAA TCGAAGAAAA AGGCAAAATA ATTGCTACTG AAATAAAGGA 6240

GATGGTATAG ATGGCGACAT TTAAAGATTT TAGAAATAAT GTTAAGCCTA ACTGGTGCCC 6300

CGGATGTGGC GATTTCTCAG TACAAGCTGC AATTCAAAAA GCAGCCGCAA ATATAGGGTT 6360

AGAACCTGAA GAAGTAGCTA TCATCACCGG TATAGGATGT TCTGGCCGTC TTTCAGGATA 6420

TATTAATTCT TATGGCGTTC ATTCTATTCA CGGACGTGCA TTACCTTTAG CTCAAGGTGT 6480

AAAAATGGCG AATAAAGATT TAACTGTTAT TGCATCGGGA GGAGATGGTG ATGGTTATGC 6540

TATAGGTATG GGGGATACAA TCCATGCTTT AAGAAGAAAT ATGAACATGA CGTATATAGT 6600

CATGGATAAT CAAATTTATG GTTTGACAAA GGGACAAACA TCGCCGTCAT CAGCAGTAGG 6660

ATTTGTTACT AAAACAACGC CAAAAGGTAA TATAGAAAAA AATGTTGCGC CTTTAGAATT 6720

AGTATTATCA TCTGGTGCCA CATTTGTAGC CCAAGGTTTT TCAAGCGATA TTAAAGGATT 6780

AACAAAACTA ATTGAAGATG CAATTAATCA TGATGGATTT TCATTCGTTA ATGTCTTTTC 6840

ACCATGTGTG ACTTATAATA AAATTAACAC ATACGATTGG TTTaAAGAAC ATTTAACAAG 6900

TGTTGATGAc ATTGAAAATT ATGATTCTAC AGATAAACAA TTAGCGACTA AAACTGTTAT 6960

TGAACATGAA TCTTTAGTAA CTGGTATTGT TTATCaAGAT AAAGAAACAC CATCATATGA 7020

ATCtCAAATT AAAGAGTTAG ATGATmCACC ACTTGCTAAA AGAGATATCa AAATTaCTGA 7080
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TGTATTTATA ACAGATCCAT TTATGCTACT CAGTTTTTTA CTATTACAAA AAATAAAGGA 72 0 0 

GTTTTTAAAA ATGAAAGACA CATTAATGAG TATACAAATA ATTCCTAAAA CACCAAACAA 7260

TGACAATGTT ATACCTTACG TAGACGAGGC GATTAAAATA ATTGACGAAT CTGGTTTGCA 7320

TTTTAGAGTA GGTCCGTTAG AAACGACAGT ACAAGGAAAT ATGAATGAAT GTTTAATTTT 7380

AATACAATCA TTAAATGAAC GAATGGTGGA ACTTGAATGT CCAAGTATTA TTAGCCAAGT 7440

TAAGTTTTAT CATGTGCCAG ATGGCATCAC TATTGAAACT TTAACTGAAA AATATGATGA 7500

ATAACATTAA AAGTGAAGTA AACTGGATTT GAATTGGCTT GTTAGAGATG ACGTATAACT 7560

TTAACTGTTT TTGCACTTTA TAGTTAAATT TAATATAATT ATTAAATGAT ACGGGCAAAT 7620

AGAAAGGATT TTGTAAAGTG AACGAAGAAC AAAGAAAAGC AAGTTCTGTA GATGTTTTAG 7680

CTGAGAGAGA TAAGAAAGCA GAAAAAGATT ATAGTAAATA TTTTGAACAT GTTTATCAGC 7740

CGCCTAATTT AAAAGCAAGC GCAAAAAAAG AGGTnAAA 7778
(2) INFORMATION FOR SEQ ID NO: 49: 






(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1128 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 

AGATGAAGTT GTTACgAAAA TTGCGTACGC TGTTTCAGAA CATGTCAAAA TAGAAACAGG 60

TAATCCATTC TTTCAAACAT CACATAGTGG TTGTGCGACG GGCGGATCCT GTAATTGTTC 120 

ATTATAAAAA ACATCGAGTC AGAAAAAGGT GGTTATTGAA cCACTAACTA GCATCTGACT 180 

CGATGTTTTT ATTTATTCGG GATTGTTTGT TTGAATTGTT GTGCTAAATC TGGTCGATCT 240 

GTCACAATCG TGTGTGCACC TTTTTGGTAT AAATCATTCA TCAGATTTAT ACTATTTACG 300 

CCATAATAGC CTGGAATGAT ATTCATATCA TTTAACCATT TGATAAAACG AGATGAAGTC 360 

AAATCAATGC CTTTAAAATG AGTAGGCATT TGGAACGTTT GTGCTAATGG TTGGTAGTAC 420

CTACCACCTA ATAAATGATA TTTTAAAAAT GCTTCTGTAA CTTCCTGTTG GCTAGCACCA 480

ATTGCGACGG ATCCTTGTGC AATTTTATTA AAACGAACGA TTTGTTCTTT ATAAAAACTT 540

GTCACAAGAA CGCGGTCAAA TGCTTGATTT TCTGCAATTG TATCAAACAT AATTTGTGGT 600

GCGATTGAGC CTTCATAGGA TTCAGGAGCA TCTTTTAAGT CTACGTTTAT ATACATATCA 660 

GGATATTGCT TCAGCAACTc ATCGAAGGTT AGTATAGCTG TGTGTGCATG ACCACGATAT 720 
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AATGTATGGG CACTAACTTT TCCAGAGCCG TTCGTCGTTC TATCAACAGT TGCGTCATGA 84 0 

AAAACGATAA GCTGTTGATC TTTTGTGAGT CTCACATCTG TTTCAAAGCC ATCAACGCCT 900 


AATTGTTTAG CATAGTCAAA TGCAAGTTGC GTTTGCTCTG GTCTTAAAGC CATACCACCG 960 

CGATGCGCAA ATATATATGG TGCATTGCCT TTGAAAAAAG CAGGGATGGT TTGCTTTTTA 1020 

GTAATCACTT TATTTTTATT GATCATTAAT AGACTACTTA AAAATCCAGC ACCGACTAGT 1080

ACCGCATTTA AAATGTTTCT GTTTACnTTT TTCATAAAAA ATTCCTCC 1128 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6252 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50:

CAAGCAAACA ATCGTCGATA AAATTGCTAA AATAATAAAA GTAATTCGAA CTTTCATCAT 60 


GATCATCCTT TGTTTATAGA GTCAATATAA GTATGGAATA TGTTAGGTAT ATAGTCAAAT 120 

GCGTCAACTA ATGGGAATTT TGGCATAGAT AGAGAATTTA AGGCAATTAA AAAGGCATCA 180 

AACAGTAATA TGCTGCTTGA TGCCCAAATG ATGACTTTAG CTAAATTGAT TAGTCACTTT 240

TAAAGATAAA GAATTGTCAT GAATTAAAAC TCATGTAATG ATGTGTTACA TTTCGCAATG 300 

ATGGCTTTCA GTTATTTATC GATAACATCA CTCTTGATAC CTTTAGATTT TAAGAAATCT 360 

TTAATTTTAT CTTGTTGCTT TTTATTAACA TCACCGGCAT ATTTTGTTGG CACGTCGACA 420

ACATTGATTT TATTTTGCGG TTGATAGCTA AGCTTTTCAA TATCTTCATC AACATTGGCG 480

ATTGTACTAT TTAAAGCTTT GAAGTAATTC ATCATTAATT CAACGGGTTT CTTATATTCT 540

TTAGGAATAT TGTTTTCAGT GACAAATTTC TTGAAATGCA AATCGTTTTT AACAGCTAAG 600 

TTAGATAAGT GGCTAAGTGT TTCTGCTTGT TTTTCAGTCA CTTTTGTTTG ACTGTCAATT 660 

TGTTTATCTA GTTTATGTTG CATAATATAT TTGTTATCAA GTATATCGCT ATTTACAGAC 720

AAATACTTTT CTATAGCTTG CTTCATCTCT GCATCACTAA TATCACTATT TTTCTTATCT 780 

GAGTTAAAGA TATCTTTTGT tTCTAATTTT TTAGCGCTTT TAGGTGCATG GATGCCAGTA 840 


CTTGTATGAT GATCTTCGTT ATCAGATTGA TCGGACGCGC AACCTGTAAG AATTAATGTC 900 

GATGCTAAAA ATGTACTTAG TAGTAATCTC TTTTTCATAA TGTAATATAA CTCCTTAGTT 960 

TATCTTTAAT TGAAAAAATA TGTATTCATG TTTAATAGAG TAACATTGAA TTAGTTTGGA 1020 
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TCTATCAATA ATGCATCATT TTGGACGTTG TTAAGGATAG CTTTATCTAT AAATAACTGC 
ATAATTGGTT GTACTAATTT AGACGTAGGT ATCGTACGTA AAAGCATAAT AATTTCGTTC 1200
ACATACTTTT CTTTCTCAAT ATCATTTTTC ATATTGATTT GTTTGCGAGA GGTACATACT 1260
TTAAGCATTA TCGCACATCT CGTTGTATAT ATTAAGTTTA TCATAACATG ATTTTATGTC 1320
GGGATAAAAA AATAACAGCA TCTTAACAAA TGTAAGATAC TGTCAGTGAA ATGAATGAAA 1380
CTTTAGTTTC TGaTAATATA GTCAAAGGCA TTTAATGCTG CATTTGCACC AGCGCCCATT 1440
GAAATGATAA TTTGTTTGTT CTTCTGATCT GTGACATCGC CAGCAGCAAA TATTCCAGGA 1500

ACATTCGTAT TATTGTTACG ATCAATCACA ATTTCACCAC GTTCGTTTAA TTCAACAGCA 1560

TCGTTTAACC ATGATGTGTT TGGAAGTAAA CCAATTTGAA CAAAGATACC ATCTAAGTTA 1620 

AGTAGATGTT CTTCGCCGGT GTTCATGTCT TCGTAACGTA TACCTGTAAC ATGGTCTTCT 1680 

CCGACAACTT CAGTAGTTTT GGCATTTGTT TTGATATCAA CATTTGATAA AGAACGTAAA 1740

CGATCTTGTA ACACGTTGTC TGCTTTTAAT TCGCTAGCGA ATTCGAATAA TGTAACATGA 1800 

TTAACGATAC CAGCAAGGTC AATTGCTGCT TCAACCCCAG AGTTACCGCC ACCGATAACT 1860

GCTACGTCTT TATTTTCAAA TAGAGGTCCG TCACAGTGAG GGCAGAATGC AACACCTTTA 1920 

TTAATCAATT GCTCTTCACC TGGAATGTTT AGCTTACGCC AACCTGCACC AGTAGCAATA 1980 

ATGACTGTTT TACTTTCTAA GACAGCACCG TTTTCTAACG TAACTTTAAT TGCTTCGTCA 2040

GTCTTTTCGA TATCTGTAGC ACGTATACCT GTCATTGCAT CAATGTCATA TTGATCAATG 2100 

TGCGCTGCTA AGTTAGAAGA AAATTCAGAA CCAGTTGTTT CTTTAACAGT AATGAAGTTC 2160 

TCAATACCAG CAGTATCATT AACTTGGCCA CCGATACGAT CAGCAACTAT ACCAGTACGt 2220

AAACCTTTAC GTGCTGTGTA AATCGCTGCA CTACCACTAG CAGGACCACC ACCAACGATT 2280 

AAGACATCAT AAGGTTCTTT ATTTTCAAAC TCAGATGCAT CTGCCGTACT GCCTAGTTTC 2340

GAAAGAATAT CTTGGATTGT CATACGACCA TTGCCAAATT CTTCGCCATT TAAAAAGACA 2400 

GCAGGGACTG CCATGATGTT TTCAGATTCT TCACGGAACA CTGCACCATC AATCATAGAA 2460 

TGCGTGATGT TAGGGTTGAT CACACTCATT AAGTTAAGTG CTTGAACGAC ATCAGGACAT 2520

TTTTGACACG TTAAACTAAT GAATGTTTCA AAATGGAATG AACCTTCTAA TTTTTTAATT 2580 

TGGTCAATGA TTGACTGTTT TTCTTTAGGT GCACGACCAC TAACCTGTAA AATTGCTAAA 2640 


ACAAGTGAGT TAAACTCGTG ACCTAATGGA ATACCTGCAA ATGTTACACC TGTTTCTTCG 2700 

CCAGGACGAT TGACTGAGAA ACTTGGTGTA CGTTTTAAAG ATTTTTCAGA AAGAGATAGT 2760 

CTAGGTGACA TATCAGTAAT TTCTGTCAAC AAATCTTTAA GTTCTTTGGA TTTATCATCT 2820 
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TGTTGTTTTA AATCAGCATT AAGCATGGTT GTAATGCCTC CTT AGATTTT ACCTACTAAA 2 940 

TCTAAACCAG GTTGCAATGT TTTAGCGCCT TCTTCCCATT TAGCTGGGCA TACTTCGCCA 3000

GGGTTTTTAC GAACATATTG AGCTGCTTTG ATTTTGTGAG CTAATGTACT AGCGTCACGG 3060

CCAATTCCGT CAGCGTTAAT TTCAGATGCT TGTACAACAC CGTCTGGGTC GATAATGAAT 3120

GTACCACGTT GAGCTAAACC AGTAGCTTCA TCTAATACAT CAAAATTACG AGTGATTGTT 3180

TGTGATGGGT CACCAATCAT AGTGTAAGTG ATTTTGCTAA TTGCATCTGA ATGGTCATGC 3240

CATGCTTTGT GTACGAAGTG AGTATCAGTT GATACTGAGA ATACATTTAC GCCTAATTTT 3300

TGTAATTCTT CATATTGGTT TTGTAAGTCT TCTAATTCAG TTGGACAAAC GAATGAGAAG 3360

TCAGCAGGAT AGAAGCATAC TACGCTCCAA GAACCTTTTA AATCTTCTTG TGTAACTTCT 3420

TTAAATTGAT CTTTTTTTGG ATCGAAArCT TGCGCTGTAA ATGGTAAGAT TTCTTTGTTA 3480

ATTAATGACA TAAATATCTT CCTCCTAAGA ATTTAAGTAT GAATTAGAAC TATCAATTGA 3540

TTGCGCTTAA TTATAATAAT TCTAATCTCT TAGTTAGCAT TATTACATTT TGATCCAGAA 3600

TAGTCAACTG GATAACTTTG TAAAGTGAAT GATTACTTTT AAAATAAAGA AAGATAATAT 3660

AAAGTGCTTT GATAATGGAT TTTGTAGTTG ATGATTTAAA AGGTTGTGTC TATATTTAAT 3720

ATCTTGATTT TAATGTAAAA AATGTAAAAA AAGAAGATTT GTATTCTCAA CTAAGTCAAC 3780

CTTATTGATA ATGGTATGAG AATATTTGTT CGAGATGGAT GAAGGTAATG AGTGAGAAAC 3840

TGGATTTTTA AAGTATGAGA CAATATTTTA AAAAGTTCAA TTATTAACTT ATAAGCAAAT 3900

AATTGCTATA AAAAAGTTTG GACGTGTACA ATTGCAATAT GAAGATTTTA AATTAATTGT 3960

AAAGTATCGA GGAGTGGGTA ACGTGTCAGA ACATGTATAT AATCTTGTGA AAAAGCATCA 4020

TTCTGTTAGA AAATTTAAGA ATAAACCTTT AAGTGAAGAC GTTGTTAAGA AATTGGTAGA 4080

AGCTGGACAA AGCGCTTCGA CGTCAAGTTT CCTGCAAGCA TACTCAATTA TTGGTATCGA 4140

CGATGAGAAG ATTAAAGAAA ATTTACGAGA AGTTTCTGGA CAACCTTATG TTGTAGAAAA 4200

TGGCTATTTA TTCGTCTTTG TTATTGATTA TTATCGTCAT CATTTAGTTG ATCAACATGC 4260

TGAAACTGAT ATGGAAAATG CATATGGTTC AACGGAAGGT TTGCTAGTAG GTGCAATCGA 4320

TGCAGCATTA GTTGCCGAAA ATATTGCGGT AACTGCTGAA GATATGGGGT ATGGCATTGT 4380

CTTTTTAGGA TCATTAAGAA ATGATGTTGA ACGCGTTCGA GAAATTTTAG ACTTACCTGA 4440

CTATGTCTTC CCGGTATTTG GTATGGCAGT AGGGGAACCc GCAGATGACG AAAATGGTGC 4500

AGCCAAGCCA CGCTTACCAT TTGACCATGT CTTCCATCAT AATAAGTATC ATGCTGATAA 4560

GGAAACACAG TATGCACAAA TGGCAGATTA CGACCAGACA ATCAGCGAGT ACTATGATCA 4620
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CAAAGCAAGA TTAGATATGT TAGAACAATT GCAAAAATCA GGCTTAATAC AGCGATAgCA 4 74 0 

AGATACCAAA ATAACCCGCC CCCCTCTAGC TTAAAATGAT AAGTATAGCT AGAGGGGGCG 4800

GGTATTTCTT GCAATGAATT AGTGTGAAGT TAATGCAGCA TTATCATTTG AATCGAAAGT 4860 

ATCTTTATCC CAATGTTTAG TTAACTTGGC GGTACCTGTA CCAGCTAGCA TTGAATCGTT 4920

CACGTTTAAT GCTGTTCTAC CCATGTCAAT CAATGGTTCA ACGGAGATGA GCACGCCGGc 4980

TAAAGCGACT GGCAAGTTTA ACGTTGACAA CACCAATATG GATGCAAATG TAGCCCCGCC 5040

ACCGACGCCA GCAACGCCGA ATGAACTAAT AATCACGACA GCGATTAACG TTACAATAAA 5100 

TTGTAAATCA ATTTCTACAT TAGCGACGGG TGCGACCATA ATTGCAAGCA TGGCAGGGTA 5160

AATGCCTGCA CAACCATTTT GTCCAATCGA CAATCCAAAT GTCGCAGCGA AATTGGCAAT 5220

ACCTTCTGGC ACGCCTAGAC GTCTTGTTTG TGTTTGTACA TTCAATGGTA AGGCACCOGC 5280 

GCTTGAGCGT GATGTGAATG CAAAGATTAA TACTTCCAAA GTCTTTTTAA CATAGCGAAT 5340

TGGGCTAATA CCTAACAGGC TTAAAATAAT TAAGTGAATG ATATACATCG TAATTAATGC 5400 

AGCGTACGAT GCGATTAAGA ATTTTCCTAA AGTCCAAATG GCGCCAAAGT CACTTGTCGA 5460 

TAATGTGTTG GCCATAATTG CTAATACACC GTATGGCGTT AAACGTAAGA CGAACGTCAC 5520 

AATCGCCATT ACTAGTGAAT AGATAGCGTC AATCGCACGC TTAAGCAATT CACCATGATC 5580 

AGGTTGTTTG CGTnTACGCG TAAATAAGCA AATCCTATAA ACGAAGCAAA TATCACGACA 5640

GCAATCGTGG aAGTTGCACG TTGTCCaGTG AAATCTAAGA ATGGATTTTT AGGCAATAAT 5700

TCCAAAATTT GTTGTGGTAA CGTATGTGCT GTTAAATCTT TCGCTTGTTT AGCAATTTCG 5760

CTTCCACGTG CTTGTTCAGC GTTACCAAGG TTAATTGTTG ATGCATCTAA ACCAAACACC 5820

AAGGCATACA CAACACCAAC AATCGCAGCA ATGGTGACAG TGCCAATTAA AAAGATAAAA 5880 

ATGASACTAC CAATTTTAGC AAACTTTTCT CCGATTTGAA TTTTAGTGAA TGCAGCTACA 5940 

ATAGAAATGA AAATTAAAGG CATAACAATC ATTTGCAACA ATGCAACGTA ACCTTGTCCG 6000 

ACAATGTTGA ACCAGTCACT TGTTGATGTA ATAACATTCG AATGTGTGCC ATAAATAAGA 6060 

TGCAATAACA CACCGAATAC TATACCAATC CCTAAAGCTG TAAACACACG TTTCGCAAAA 6120 

GATATATGTT TGCGAGCCAT CATGTGCAAT ATTACGATGA AAATCACCAA TACAATAATA 6180 

TTAATCAGTG TAAGAAAAGC ATTCATGAAC GTCACTCCTT AAATTTTTGA ATATAATTCC 6240

GACTAGTATG CT 6252
(2) INFORMATION FOR SEQ ID NO: 51: 
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(C) STRANDEDNESS : double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:

ATCAAATCnC AAAATATTTA TTAATnAnAA GGGGATTATC CaTGTgAGAA ACAAAGTAAT 60

GCTCTTTTTT TACCTCTTGT GGGTTGAAAA aTGGATCATC AGAGATAGAC TTCTTCTTTT 120

TCGAAGATGA CATTTGATAC TTTAATCTTC TAAAACCATA ACTTGTCGCA TCAAAAATGC 180 

CTTCTTGTAC AAGTAAAATC AAAAATATGC TAATAAAAAT AATTAATGAA ACATAAAACA 240

ATATATTTAA ATATGTAATG ATAGTATGGC TATTAAAAAG CCATATAATA AACGTTAATA 300

TTGGCGTTAT TAGTGCCATT CCAAGCCATT TTTTCAACAT TTGATCACTC CCACTTATAG 360 

AAAACTCTTA CGCATAGTTT ACATTAAAAT CAGACATTGA GGAATGATTT TTTAATTTCT 420 

TCAGCTTTAT TGAAATTCTA AAATCAATCA TTCTTCATTA GTTTAAAGCA AAAAAATATT 480

GATATATAGT AAATATTGTA TATATAATAT TAGTTAAGAT TTCaGAAAAT TTTGAAGGGA 540 

ATGGAAATTT AGAAATCGGA ATTTGTTAGA GGAGGGGATT AGATGGGGAA ATATATTTTC 600 

AAACGATTTA TTTATATGCT TATTTCTTTA TTTATTATTA TTACAATTAC ATTTTTCTTA 660

ATGAAATTAA TGCCAGGTTC GCCATTTAAC GATGCTAAAT TAAATGCTGA ACAAAAAGAA 720 

ATTTTAAATG AAAAATATGG ATTAAATGAT CCTGCAGCTA CGCAgTATTT ACATTATTTA 780

AAAAATGTTG TTACAGGCGA TTTTGGTAAT TCATTCCAGT ATCATAATO\ ACCTGTGTGG 840

GATTTGATTA AACCGAGACT ACTACCTTCT TTTGAAATGG GTCTTACAGC AATGTTCaTC 900 

GGTGTGATAC TGGGACTTAT TTTAGGTGTT GCAGCAGCTA CTAAACAAAA TTCTTGGGTT 960 

GACTATACAA CTACAGTTAT TTCAGTTATT GCAGTATCTG TACCATCTTT TGTACTTGCT 1020 

GTACTTTTAC AATATGTATT TGCAGTTAAA TTAAGATGGT TCCCAGTAGC TGGATGGGAA 1080 

GGTTTTTCGA CCGCGGTATT ACCGTCACTT GCATTATCTG CAGCTGTTTT AGCAACTGTC 1140 

GCCAGATACA TAAGAGCAGA GATGATAGAG GTATTAAGTT CAGACTATAT TTTATTAGCG 1200 

AGAGCTAAAG GTAATTCGAC AATGCGTGTA CTTTTTGGAC ATGCACTTAG AAATGCTTTA 1260

ATTCCAATTA TTACAATTAT CGTTCCCATG TTAGCAAGTA TTTTAACAGG CACTTTAACA 1320 

ATTGAAAATA TTTTTGGAGT TCCTGGATTA GGGGATCAAT TCGTACGTTC AATTACAACA 1380

AATGATTTCT CAGTAATCAT GGCAATCACA CTATTATTTA GCACACTGTT TATCGTTTCT 1440

ATTTTTATTG TAGATATTTT GTACGGTGTG ATAGATCCAC GAATTCGTGT TCcAAGgAGG 1500 

TAAAAAATAA TGGCTGAAAA TAAAAACAAT TTGTCGATTA ACGACGATCA TTCTAATGCA 1560 
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TGAATCAGGA ACCTGAAATG CAACGAGAAA GCAAAAACTT TTGGCAAGAT GCTTGGGCTC 1680 

AGTTAAAACG AAATAAGTTA GCTGTTGTCG GTATGATAGG TTTAATTATC ATTGTAATAT 1740

TTGCTTTTAT CGGTCCAGTT ATAAATAAAC ATGATTATGC TGAACAAAAT GTAGAACATA 1800

GAAATCTTCC GGCAAAAATA CCTGTATTAG ACAAAGTTCC ATTTTTACCT TTTGATGGTA 1860

AAGATGCAGA TGGCAAGGAT GCTTATAAAG CAGCAAATGC TAAAGAAAAT TATTGGTTTG 1920

GTACTGATCA GTTGGGTCGA GATTTATGGA CAAGAACATG GAAAGGTGCT CAAATTTCAT 1980

TGTTTATCGG TGTTGTTGCA GCGATGTTAG ATATTTTTAT TGGTGTTGTA TATGGTGCGA 2040

TTTCTGGATT CTTCGGTGGA CGTGTCGATA CGATTATGCA ACGTATACTT GAAGTCATAG 2100

CATCTATTCC GAATTTAATT GTCGTAATTT TATTTGTATT AATTTTTGAA CCATCCATTT 2160

GGACAATTAT ATTGGCTATG TCTATCACAG GCTGGTTAGG CATGAGCAGA GTTGTACGTG 2220

GAGAATTTTT AAAATTAAAA AATCAAGAGT TTGTCATGGC TTCGAAAACA TTGGGGGCTT 2280

CAAAATTCAA ATTGATATTT AAGCATATTT TACCTAATAC ATTAGGTGCT ATCGTGGTTA 2340

CATCAATGTT TACAGTACCT AGTGCTATTT TCTTCGAAGC ATTTTTAAGT TTCATTGGTA 2400

TAGGTGTACC CGCACCTCAA ACATCGTTAG GGTCATTAGT AAATGATGGG CGCGCAATGT 2460

TATTAATTTA TCCACATGAA TTATTTATAC CAGCAATGAT TTTAAGTTTA TTAATTCTAT 2520

TCTTTTACTT ATTTAGTGAT GGATTACGTG ATGCATTTGA TCCGAAAATG CGTAAATAAA 2580

AAGGGGGCAT AGCATATGAC TGAAAGAATA TTAGAAGTAA ATGATTTGCA TGTTTCCTTT 2640

GATATTACAG CAGGGGAAGT GCAGGCAGTG AGAGGCGTAG ATTTTTATTT GAACAAAGGG 2700

GAAACATTGG CAATTGTTGG TGAATCAGGT TCAGGTAAAT CTGTAACAAC AAAAGCAATT 2760

ACAAAATTAT TCCAAGGGGA CACAGGAAGA ATTAAAAAGG GAGAAATTTT ATTTTTAGGG 2820

GAAGATTTAG CAAAAAAACC TGAAAATGAG TTGATTAAAT TACGTGGCAA AGATATTTCA 2880

ATGATCTTTC AAGATCCAAT GACATCTTTA AACCCAACGA TGCAAATTGG TAAACAAGTC 2940

ATGGAACCAT TAATTAAGCA CAAAAATTAT AGTAAAGCAC AAGCTAAAAA GCGCGCATTG 3000

GAAATACTAA ATCTTGTAGG TTTACCAAAT GCAGAAAAAA GATTTAAAGC ATATCCTCAT 3060

CAATTTTCAG GTGGACAAAG GCAAAGAATT GTTATTGCAA CCGCATTAGC TTGTGAACCT 3120

AAAGTGCTCA TTGCTGATGA ACCAACGACT GCATTAGACG TAACGATGCA GGCACAAATT 3180

TTAGATTTAA TGAAAGAACT ACAACAAAAA ATCGATACAG CAATTATTTT TATAACGCAT 3240

GATTTAGGGG TTGTTGCGAA TATTGCTGAT AGAGTGGCAG TTATGTATGG TGGTCAAATG 3300

GTTGAAACAG GAGATGTTAA CGAAATATTT TATGATCCAA AGCATCCATA TACATGGGGA 3360
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GGAGCGCCAC CTGATTTATT ACACCCACCT AAAGGTGATG CATTTGCGAG ACGTAGcAAT 34 SO 

ATGCATTAGA TATTGATTTT AAAGTAGAAC CACCGTGGTT TAAAGTTTCA CCGACACATT 354 0 

TTGTGAAATC TTGGTTATTA GACGCACGTG CACCAAAAGT TGAACTACCC GAGCTGGTAA 3600 

AACAACGTAT GAAACCGATG CCTAATAATT ATGAAAAACC ACTCAAGGTA GAAAGGGTGT 3660 

CGTTCAATGA AAAATGATGA AGTGCTATTA TCTATTAAAA ATTTAAAGCA ATATTTTAAC 3720 

GCAGGAAAGA AAAACGAAGT GgaGCGATTG AAAATATTTC GTTTGATATA TACAAAGGGG 3780 

AAACATTAGG TTTAGTAGGA GAATCGGGGT GTGGTAAATC TACAACTGGT AAATCAATTA 3 84 0 

TTAAACTTAA TGATATTACA AGTGGAGAAA TTTTGTATGA GGGTATTGAT ATACAAAAGA 3 900 

TTCGTAAACG TAAAGATTTG CTTAAATTTA ATAAAAAGAT ACAGATGATT TTTCAAGACC 3960 

CATATGCGTC TTTAAATCCT AGGTTAAAAG TAATGGATAT AGTAGCTGAA GGTATTGATA 4020

TCCATCATTT AGGAACTGaT AAGCGTGACC GAAAAAAACG TGTCTATGaT TTACTTGaAA 4080

CTGTTGGATT AAGTAAAGAA CATGCCAATC GCTATCCTCA TGAATTTTCA GGTGGaCAAC 4140

GCCAACGTAT TGGaATTGCC CGTGcATTAG CCGTTGaACC AGAATTCATT ATCGCGGACG 4200

AACCAATATC GGCATTGGAT GTTTCAATCC AAGCTCAAGT AGTTAATTTA TTATTAAAAT 4260

TACAACGTGA AAGAGGGATT ACGTTCCTAT TTATAGCTCA TGATCTATCA ATGGTGAAGT 4320 
ATATTTCAGA TCGTATTGCA GTCATGCATT TTGGGAAAAT AGTTGAAATT GGACCGGCAG 4380

AAGAAATTTA TCAAAATCCA TTACACGATT ATACTAAGTC TTTATTATCA GCCATTCCAC 4440 
AACCTGATCC TGAATCAGAA CGCAGTCGCA AACGATTTAG TTATATTGAT GATGAAGCAA 4500

ATAATCATTT AAGACAATTA CATGAAATTA GACCGAATCA CTTTGTCTTT AGTACTGAAG 4560

AAGAAGCGGC ACAACTACGA GAAAATAAAT TGGTGACACA AAATTAAGGG GAAGGGGGAA 4620

ATGfiAATGAC GAGAAAATTT AGAACACTTA TTTTAATTTT GATTGCTACA ATTGCATTAA 4680 
GTGGTTGTCC TAATGACGAT GGTATTTATT CAGATAAAGG TCAAGTATTC AGAAAAATTT 4740 
TGTCATCAGA CTTAACATCC CTTGATACAT CATTAATAAC GGATGAAATA TCTTCTGAAG 4800 
TGACTGCGCA AACATTCGAA GGTTTATACA CATTAGGAAA AGGTGACAAA CCGGTGTTAG 4860 
GTGTTGCGAA AGCTTTTCCT GAAAAGAGTA AAGATGGTAA AACTTTAAAG GTTAAATTAA 4920 
GAAGCGATGC TAAATGGAGC AATGGTGACA AAGTGACTGC ACAAGACTTT GTTTATGCTT 4980

GGAGAAAAAC AGTTGACCCT AAAACAGGTT CTGAATTTGC ATACATTATG GGGGACATTA 5040

AAAATGCGAG TGATATTAGT ACTGGTAAGA AACCTGTAGA GCAATTAGGT ATCAAAGCAT 5100 
TAAATGATGA AACATTACAA ATTGAATTAG AAAAGCCGGT TCCATATATT AATCAATTAT 5160 
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ACGGTACGGC 


AGCTGATAGA 


GCGGTATACA 


ATGGTCCaTT 


TAAAGTTGAT 


GATTGGAAAC 


5280 


AAO Mb A IAA AACCTTACTA TCTAAAAATC AGTATTATTG GGATAAAAAG AATGTAAAAT 5340

I AO A 1 AAAvj X GAATTATAAA GTTATTAAAG ACTTACAAGC CGGTGCATCA TTGTATGATA 5400

I OAATQ. AC* r AGATGACGCA TTTATTACTG CAGATCAAGT AAATAAATAT AAAGACAACA 5460

AAGGATTAAA CTTTGTGTTA ACGACTGGGA CArrrrrixiT AAAAATGAAT GAAAAACAAT 5520

ATCCTGATTT TAAAAACAAA AATTTAAGAT TGSTATCGCA CAAGCAATAG ATAAAAAAGG 5580

ATACGTTGAT TCAGTGAAAA ACAATGGCTC AATTCCTTCC GATACACTAA CAGCCAAAGG 5640

AATTGCGAAA GCGCCTAATG GCAAAGATTA TGCGAGTACC ATGAATTCGC CTTTAAAATA 5700

TAATCCTAAA GAAGCAAGAG CACACTGGGA CAAAGCTAAA AAAGAGTTAG GTAAAAATGA 5760

AGTGACATTT TCAATGAACA CAGAAGATAC ACCAGATGCA AAAATATCTG CTGAATATAT 5820

CAAATCGCAA GTTGAGAAAA ATTTACCAGG AGTTACTTTG AAAATTAAGC AATTACCGTT 5880

TAAACAAAGA GTATCACTAG AACTGAGTAA CAATTTTGAA GCATCACTTA GTGGTTGGTC 5940

TGCAGATTAC CCTGATCCTA TGGCTTATTT AGAAACAATG ACCACAGGTA GCGCACAAAA 6000

TAATACAGAC TGGGGTAATA AAGAATATGA TCAATTACTT AAAGTAGCAA GAACCAAATT 6060

GGCACTTCAA CCGAACGAAC GATATGAAAA CTTGAAAAAA GCAGAAGAAA TGTTCCTAGG 6120

AGATGCACCG GTAGCACCAA TTTATCAAAA AGGTGTtGCA CATTTaACAA aTCCTCAAGT 6180

AAAAGGATTA ATTtACCATA AATTTGGTCC AAATAACTCA CTTAAACATG TATATATTGA 6240

TAAATCGATA GATAAAGAAA CAGGTAAGAA GAAAAAATAA TATGCTTTGT AAATTAGGCT 6300

GGAGACATAT CTCCAGTCTT TTTGTGTTGG ATAAAAaCTT TGGGAATAAA AATTTAAAAT 6360

AAGTCGTTTT TTAAATTACT GAAATTGATT AAATGCATAA ATAACTGAAT ATTCTAAAAA 6420

TAA/CtTTGTA ATAATTTTTT CTATGAGTAA ACTAAAAAGA AAAAATTAGA TTGAAAGTAG 6480

GAGGCATATG TATGGGGAAG CTAATTAAAT ATATTTCAAT ACTTCTTATT GTCGTTTTAG 6540




TGTTGAGTGC 


TTGCGGAAAA AGCAGTAATA 


AAGATGAAGG 
TAAATGTAGC 


AGTAAAAGAT 


GCTACTAAAA 
CCGCCAAGTG 


6600 
6660 


45 


CGGAAACCTC 


AAAACATAAA 


GGTGGTACCT 


ATTAACAGCA 




GTGTTTATTC TTCGTTATTA AATAGTACAC ATGCAGATTC TGTAGTTGAG GGATATTTTA 6720

ACGAAAGCTT 6730


(2) INFORMATION FOR SEQ ID NO: 52:











(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6482 base pairs 

(B) TYPE: nucleic acid 
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(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 
AATTTTTGTC ATTATTAAAA ACCTCGCTTT TAAAAGATTG AAAAGTAAAT GAGTGAAATT 60 

AAAGATTATG CACATTAAAA TCACGCCACA ATTTAATTGT GAAAAATATC ACAAATATAT 120 

TATAACACTA AATTTCCCAA AATTCAAAAG TGTGTTTTAT TGCAGAAAAC TTATAACAyG 180 

TGCACAAGTT ATAGTGAATT GCAAACGGAT TACTTTAGTC TTTTTAAAAC ATGAAGTATA 240 

ATTTGTATAG CAATAAATAT AAAAATGGGA GGCTATGTTC AATGAGCAAT ATGAATCAAA 300 

CAATTATGGA TGCATTTCAT TTCAGACATG CGACTAAGCA ATTCGATCCA CAAAAGAAAG 360 

TTTCGAAAGA AGATTTTGAA ACAATATTAG AGTCAGGTAG ATTGTCTCCA AGTTCTCTTG 420

GGTTAGAACC TTGGAAGTTT GTCGTGATTC AAGATCAAGC GTTACGTGAT GAATTAAAAG 480 

CGCACAGTTG GGGCGCAGCA AAACAATTAG ATACAGCGAG CCATTTTGTG CTAATTTTTG 540

CGCGTAAAAA TGTAACGTCA AGATCACCGT ATGTACAACA TATGTTAAGA GATATTAAAA 600 

AATATGAGGC ACAAACGATT CCAGCTGTTG AACAAAAATT CGATGCATTC CAAGCAGATT 660 

TCCATATTTC TGATAATGAT CAAGCCTTGT ATGACTGGTC AAGTAAACAA ACGTATATTG 720 
CATTAGGCAA TATGATGACG ACAGCCGCAT TGTTAGGTAT TGATTCATGT CCGATGGAAG 780 
GTTTTAGTCT GGATACAGTG ACAGACATTT TAGCAAATAA AGGGATCTTA GATACTGAGC 840 
AATTTGGTTT ATCAGTGATG GTCGCATTTG GCTACAGACA ACAAGAGCCA CCGAAAAATA 900 
AAACACGCCA AGCTTATGAA GATGTTATTG AATGGGTTGG ACCAAAAGAA TAAATAGAAT 960 

ACCGTATGTC TAAATATATA AAATTAAAAA GTTAGCAATA AAAAAGCCTG CGATTACATA 1020 

AATGAATCGC AGGcTTTTGC GTGAAAAAAT TGTATTAATA AAGTATGGAT GATTATTTTT 1080

CTGGSACAAG GTCAGTATTT GAATGAACTG TGATGTCAAA CCCTTCTGGT GCCGTAAATG 1140 

TATGTGTTGA GGCGTCGGGT TGATAAATAT CAACATGTGT TAATCCATAA CTTTGTGAAT 1200 

TGTTTTGTCT TGCTTGATTG GATTGCCAAG TATTAGCAGC AATATGATGG TGATAATGAT 1260 

TCGTTGACAT AAATAGCGCA CGTGGAAAAT CAGACACATG TTGGAATCCT AATTGTTCAA 1320 

TGTAACATTG ATATGCTGCG TCTAAATCAT GTGTTTTTAA ATGTAAGTGT CCAATCATGC 1380 
CTTTTGCTGG CATTCCTTGC CAACCTTCAT CAGTACGATG TGTTAATAAG GTTTGGCTAT 1440 
CAACTTCTAA AGTATCCATT TTAACTTTGC CATTTTGCCA TTCCCATGAA GATGAAGGTC 1500 
TATCGCGATA GACTTCAATA CCATTACCTT CGGGGTCGTT GAAATATAAA GCTTCACTTA 1560
CTAAATGATC ACCAGCGCCG ATGCCCATAT TTTTTTGTGC CACGAAATAT AAGAAGTTAG 1620 
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aAGTCTGACG GcCGTCTTCT AATAAATGTA ACGTTAGAGT ATGGc CACCA 


GTCCCAACAG 


1740 


ATAATACGGT TGTATTATCG TCAGAACTTT TAACGGATAG TCCTAAAATG TTTTTGTAAA 1800

ATGTTGTCAT TAAGTCTAAG TCTCTTACGT TCAGTACAAT GTTTGTCACT TGTGTTGCTG 1860

TTTTATCGTG AAATGCCATT ATGCATCGCC TCTTTTTCTA TTTTTCTATA AGTTAGTATA 1920

AAAAGTATAC CAGAAAAGAA AATGAATTGA TAGCATAAAG TTTGAAATGC AAAATAACTA 1980

GTCGTTTTGC AATTTTAtAT TGATGCGAAC AAAAAAGCGA TGGTACAGTT GCACCATCGC 2040

AAAATTTATT TAACCAAGAT ATACATCTTG ATATGAATCT TCTTTTTCTA ACATATGTTT 2100

GGCAAATGAA CATGAGGCAA TAATTTTCAA ATTATTTTCT CGAGCGTGTT CAACAACTGc 2160

TTTAAGTAGT tttttgccaa CACCTTGACC ACCAAGTTCA TCAGATACGC CTGTATGATC 2220

AATGTTAATT TCATTATTAT CCACAAAACG GTATGTGATT TCAGCTAAAG CATTATTTTC 2280

ATCATCACCA ATATAGAATT TGTTCTCGCC TTGTTTGATT TCAAGGTTAC TCATACATAT 2340

CAACTCCTAT CATGATTGAT TATAGTATTT CCCTATTCTA TTTTAACTTA AACGAAGTCA 2400

AAGGTGCATG ACAGTCATGT GACGACATTG CCACATCTAT GTAGTCGTTT TTATTAAGCA 2460

CAGTTTGAAA TGAAGATGAA AACACGTATC TTGACATTAA ATCTATTCAG CTATATAATT 2520

TATCTCGAAA TCGAAATAAA ATAAAAAAGT TGGTGATCAT ATGGATCGAA CGAAACAATC 2580

TCTCAATGTT TTTGTCGGAA TGAATAGGGC GTTAGACACA TTAGAGCAAA TTACAAAAGA 2640

AGACGTAAAG CGATATGGCT TAAATATTAC TGAATTTGCA GTGCTCGAGT TGCTTTATAA 2700

TAAAGGTCCG CAACCAATTC AACGTATTAG AGACCGCGTA TTAATTGCAA GTAGCAGCAT 2760

TTCATATGTT GTAAGTCAAT TAGAGGACAA AGGTTGGATT ACACGTGAAA AGGATAAAGA 2820

TGATAAACGT GTATATATGG citgtttaac TGAAAAAGGT CAAAGTCAAA TGGCAGATAT 2880

TTTOCCTAAG CATGCTGAGA CATTAACAAA AGCGTTTGAT UTUTTAACAA AGGATGAATT 2940

AACAATCTTA CAACAAGCGT TTAAGAAACT AAGTGCACAA TCTACAGAAG TGTAAGGCGT 3000

GCACTAAAAA TTTACATTAA AGTATCTCGA TTTCGAGATA AATGCACTAA AAAXATAAAG 3060

AGGGTATATA AAATGATAAA TAATCATGAA TTACTAGGTA TTCACCATGT TACTGCAATG 3120






TTATAAATTT 


i 1 lACAGAAG 


TACTAGGCAT 


G CGTTTAGTT 


3180 




AAAAAGACAG TCAATCAAGA TGATATTTAT ACGTATCATA cniTrnxic AGATGATGTA 3240

GGTTCGGCAG GTACAGACAT GACGTTCTTT GATTTTCCAA ATATTACAAA AGGGCAGGCA 3300

GGAACAAATT CCATTACAAG ACCGTCTTTT AGAGTGCCTA ACGATGACGC ATTAACATAT 3360

TATGAACAGC GCTTTGATGA GTTTGGTGTT AAACACGAAG GTATTCAAGA ATTATTTGGT 3420
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TTAAATGAAG GGGTAGCACC TGGTGTACCT TGGAAGAATG GACCGGTTCC AGTAGATAAA 354 0 

GCGATTTATG GATTAGGCCC CATTGAAATT AAAGTAAGTT ATTTTGACGA CTTTAAAAAT 3600

ATTTTAGAGA CTGTTTACGG TATGACAACT ATTGCGCATG AAGATAATGT CGCATTACTT 3660 

GAAGTTGGCG AAGGAGGCAA TGGTGGCCAG GTAATCTTAA TAAAAGATGA TAAAGGGCCa 3720 

GCaGCACGTC AAGGTTATGG LGAGGTACAT CATGTGTCAT TTCGTGTGAA AGATCATGAT 3780 

GCAATAGAAG CGTGGGCAAC GAAATATAAA GAGGTAGGTA TTAATAACTC AGGCATCGTT 3840

AATCGTTTCT ATTTTGAAGC ATTATATGCA CGTGTGGGGC ATATTTTAAT AGAAATTTCA 3900

ACAGATGGAC CAGGATTTAT GGAAGATGAA CCTTATGAAA CATTAGGCGA AGGGTTATCC 3960

TTACCACCAT TTTTAGAAAA TAAAAGAGAA TATATTGAAT CGGAAGTTAG ACCTTTTAAT 4020 

ACGAAGCGTC AACATGGTTA ATTGGAATGA GGAGGATTTG TGATGGAACA TATTTTTAGA 4080 

GAAGGACAAA ATGGTGCGCC AACACTAATA TTATTGCATG GTACAGGTGG TGATGAGTTC 4140 

GATTTATTAC CGTTAGGCGA AgcATTGAAT GAAAATTATC ACTTGTTAAG TATTAGAGGA 4200

CAAGTTTCAG AAAATGGGAT GAACCGTTAT TTCAAACGTC TTGGTGAAGG TGTTTATGAT 4260

GAAGAAGATT TGGCATTTCG TGGACAAGAA TTGTTGACGT TCATTAAAGA AGCTGCTGaA 4320 

CGTTATGATT TTGaTATTGA AAAAGCAGTA CTTGTTGGAT TTTCAAATGG ATCAAATATA 4380

GCGATTAACT TAATGTTGCG TTCAGAAGCA CCATTTAAAA AAGCATTGTT ATATGCACCG 4440

TTATACCCAG TTGAAGTAAC GTCAACAAAG GATTTATCAG ATGTCAGTGT GTTGCTTTCT 4500

ATGGGGAAAC ATGATCCAAT TGTGCCATTA GCTGCAAGTG AACAAGTCAT TAACTTGTTT 4560

AATACACGTG GGGCACAAGT CGAAGAAGTT TGGGTGAAGG GCCATGAAAT TACAGAAACT 4620 

GGATTAACGG CTGGTCAACA AATACTTGGG AAATAACAGT TCTATTAAGA AGCGGACAGA 4680 

TGGAAAAGAT TTTTACTTTT CATCTGCCCG CTTTTTTGAT TTTGAAGTGC TGTACTAAAT 4740 

TTTACAATAG TATAGATATT TTAATCGATA TGAGATTTGC CGGTAATACG CTTAATTAAA 4800

CCTTTATAGA GTACAGGTAT GAGTAAGATG AAACCGAACA ATCCCATAAT AGGGAATACT 4860

TTTCCAATTA ATGAAATGAa ACCGATAAAT GTACTAATAT AAGTGATGAC AGCCATTGTA 4920

ATAATAATGA TGAAGTAACG TCTGCTGAAT GGAACGCTGA AACGTGACGC AAATGCATAC 4980 
ATTAATCCAA CAACAGTATT GTAGATGACA AGTATCATAA TGACAGACAT AATAATACCA 5040 
ATTGACGGAG ACATTTGTGT CGCTAATTTT AATGTAGGTA GATCTACGTG TTTAATTTTA 5100 
TCGAATTGAG AAATTAAACC TAGATTAATC ATCATGAGTA AAAATGTAAT GATTAAACCG 5160 
CCAATCAAGC CCCCGTATAA CGTTGAGTCA CGATATTTAA CTTTACTACC CATCACTGAT 5220 
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CCAGGTGATA 


ATGATTTCTG 


CTTATGAATC 


TGAGCATCAT 


TATTAGCGGC 


AGTAAAATCA 


5340 


AGATGACTTG TTGTGAAATA GTAGACCGCA ATCATAATGA CAATCGCAAT TAAAAATGGG 5400

GTAACACCGC CAAGCACAGC AATTAAACGA TCGAATTTTA GAAACAGTGT TGCTAAAATA 5460

AAGGCGACTA ATATGAGTGC GCTCAGCCAA TACGGTAAGT TGAAACTTTG ATGAATGGTT 5520

GACGCACCAC CTGCAGTCAT AATAATAGCT AAAGACAACA TAAACATTGT TAAAATAATA 5580

TCAAAACCTC TTGCAATAGA GGGGTATAAG AAATAGTTAA TTGAATCAGA ATGATTTCTG 5640

GACTTTAGAT GATGACCTGT ATGCATGACA ACCATTCCAC CTAAAGTAAT CAATAGTCCT 5700

GTTACAATAA TGCCTGAAAT GCTATATGCG CCATGACTTG TGAAAAACTG GAAAATTTCT 5760

TGACCAGTAG CAAAGCCGGC ACCAACGACA ACACCAACAA AGGCAAATGC CACAATAATG 5820

GACTCTTTTA AGATACGCAT GATTTAAAAA TGTCCCTTCG TAATTTTAAG TAATATAGAA 5880

AATGTAACAT ACATGTTAAT GAAAAATATA GTACTAATAT AGTATTTrGT TAAATTGGAG 5940

TAGAAGCGAG GGTGTCGGTC ATTTCATTAA TTTATTAGTT GATTTTGCAT TTTTTTGCTG 6000

TAAAGTTGTT ATAATACAGT TAACAGGAAT TAGCATAGAT ACACCAATCC CCTCACTACT 6060

CGCAATAGTG AGGGGATTTT TTTCGGTGTA GCTAGGTCGC CTATTTATCA TCGTGTTTGC 6120

GTAgCaATGC GTAAACACAG TACCACTAAA TAAGTGCACG ATACATGCAT CAAATGTCGT 6180

CTTTAGTcTA AGTAACGATC ATGCATTAAC ATTTTCAAAA TATCTATTTG AGCTTGAAGA 6240

TCTTTACCAA TATTGGTATC ACGAATCTTC TTACGTTGTA ATTCTTTATC TACGACGCGC 6300

TTTATAGAAA GTTCATCGAT ACCTTGUGAA AGTATTTTTn CTTTAGCGTT AAAi'l^ii'iGG 6360

TGTGCAACGA GTTGCATACC GAATGAATTA TACAATAGTG TATAGCCTGC AATGCCAGTn 6420

GTTGACTGAT AAGCTTTTGA AAAGCCACCA TCAATGACAA GCATCTTTCC ATCAGCCTTG 6480

AT 6482


(2) INFORMATION FOR SEQ ID NO: 53: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 16592 base pairs 

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

ATTTAAGGCG ATTGCTTGTG TATTTCTCTC TTTTGTAGGC AAACCTGCAC TCGTTCCAAA 60 
AAATGTAACT TCCATATATG CCCCTCCTTT TCTTCAATTC ATTTTATCAT AAAATTTGTA 120 

AATTTTTCTA ACTTTAACGT AGACATAACT ATATAAATTT TGATAATTAC GTTATACTTA 240 

TCATTAATAA GTATCACATT AAACATGATA CATGAATCGA TATTTCATTT AAGACACTGC 300

ATACAGTCGA GCATATTGTA TGACCTACTG AATGGATTAT CTTATAATAA TAAATCATAT 360

ATCTAATTAA GAATTGAGGT TTTAATCTTG AGTACTAAAA ACAAACACAT CCCATGTTTA 420

ATCACAATCT TTGGTGCACT GCGTGACTTA AGCCATCGTA AGTnGTTTCC ATCAATATTC 480

CATCTCTACC AACAAGACAA TTTAGATGAA CATATTGCCA TcATCgGTAT TGGACGTCGT 540

GACATkwnTA ATGATGATTT CCGTAATCAA GTAAAATCAT CAATTCAAAA GCACGTAAAA 600

GATACAAACA AAATTGACGC GTTTATGGAA CATGTCTTCT ATCATAGACA TGATGTTAGT 660

AATGAAGAAA GCTATCAAGA ATTACTAGAT TTTAGTAATG AATTAGATAG CCAATTTGAA 720

TTAAAAGGTA ATCGACTATT CTATTTAGGA ATGGCACCAC AATTCTTTGG CGTTATTTCT 780

GATTATCTAA AATCTTCTGG TCTTACTGAT ACAAAAGGAT TTAAACGCCT TGTTATCGAA 840

AAACCATTCG GTAGTGATTT AAAATCAGCC GAAGCATTAA ACAATCAAAT TCGTAAATCA 900 

TTTAAAGAAG AAGAAATTTA TCGTATTGAC CACTATTTAG GAAAAGACAT GGTTCAAAAT 960 

ATCGAGGTAT TACGTTTTGC GAATGCGATG TTTGAACCAT TATGGAATAA CAAATATATT 1020 

TCAAACATCC AAGTTACATC TTCTGAAATA CTAGGTGTTG AAGATCGTGG TGGTTATTAT 1080 

GAATCAAGTG GCGCGCTAAA AGATATGGTG CAAAACCACA TGTTACAAAT GGTTGcATTA 1140

TTAGCTATGG AAGCACCTAT TAGTTTAAAT AGTGAAGATA TCCGTGCTGA GAAAGTAAAA 1200 

GTACTTAAAT CACTGCGTCA TTTCCAATCT GAAGATGTTA AAAAGAACTT TGTTCGTGGT 1260 

CAATATGGCG AAGGCTATAT CGATGGTAAA CAAGTTAAAG CATACCGTGA TGAAGATCGC 1320

GTTGCAGATG ACTCTAACAC ACCTACCTTT GTTTCAGGTA AATTAACAAT TGATAACTTT 1380 

AGATGGGCTG GTGTACCATT CTATATTCGT ACTGGTAAAC GTATGAAATC TAAAACAATT 1440

CAAGTTGTCG TTGAATTTAA AGAAGTACCA ATGAACTTAT ACTATGgAAA CTGaTAAACT 1500

GTTAGATTCA AACCTATTAG TAATCAATAT CCAAGCTAAT GAAGGTGgTA TCTTTtACAT 1560 

CtAAATGcTA AGaAAAATAC ACAAGGTATC gAAACAGrAC CTGtCCmATT GtCTTACTCm 1620 

ATGaGCGcTC aAGaTAAAAT GaATACTGTA GATGCATATG AAAATCTATT ATTTGATTGT 1680

CTTAAAGGTG ATGCCACTAA CTTCACGCAC TGGGAAGAAT TAAaATCAAC ATGGAAATTT 1740

GTTGATGCAA TTCAAGATGA ATGGAATATG GTTGaTCCAG AATTCCCTAA CTATGAATCA 1800 

GGTACTAATG GTCCATTAGA AAGTGATTTA CTACTTGCTC GTGATGGTAA CCATTGGTGG 1860 

GGACGATATT CAATAATTGA ATTAAAACGC ACATGTTAAA CAAAAATAAA TGAGCGAATG 1920 
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TATATTATGA AATTATATTT TACAATGCCC AAAACTATTT TAATAATCAT TGAACAAATG 204 0 

GGTGTATAAT TTATAGAAAT AATGTAGAAT AAAAATAAAT GATTGAATTA ATTGGAGTGA 2100 

AAGTTTTGGA CGTTATCAAG CAAATACAAC AGGCAATTGT TTATATTGAA GATCGTTTAT 2160

TAGAGCCTTT CAATTTGCAA GAATTAAGTG ATTACGTTGG TCTTTCGCCA TACCATCTTG 2220 

ATCAATCATT TAAAATGATT GTCGGCTTAT CTCCAGAAGC TTATGCACGC GCGCGTAAAA 2280

TGACACTCGC TGCAAATGAT GTGATTAATG GTGCTACACG ACTTGTAGAT ATCGCTAAAA 2340

AATATCACTA TGCAAATTCA AATGATTTTG CAAATGATTT TAGTGATTTT CACGGCGTAT 2400

CACCTATTCA AGCCTCTACT AAAAAAGATG AATTACAAAT TCAAGAGCGA TTATATATCA 2460

AATTATCAAC TACTGAGAGA GCACCTTATC CATACAGATT AGAAGAGACA GATGATATTT 2520 

CATTGGTTGG ATATGCACGA TTTATAGACA CTAAGTATTT GTCACATCCT TTTAATGTTC 2580 

CGGATTTTTT AGAAGACTTG CTCATTGATG GTAAAATTAA AGAGTTACGA CGATATAATG 2640

ACGTTAGTCC ATTTGAACTA TTTGTTATTA GTTGTCCTCT TGAAAATGGT TTAGAAATAT 2700 

TTGTAGGTGT ACCAAGTGAA CGTTATCCTG CACACTTAGA AAGTCGATTT TTACCTGGCA 2760 

AACATTGTGC GAAATTCAAT TTACAAGGTG AAATTGATTA TGCAACTAAT GAAGCTTGGT 2820 

ACTATATTGA ATCAAGTTTG CAGTTAACAT TGCCATATGA ACGAAATGAT TTATATGTTG 2880

AAGTGTACCC TCTCGATATT TCATTTAATG ACCCATTCAC TAAAATTCAG CTTTGGATTC 2940

CTGTTAAACA GAGTCCTTAT GACGAAGATT AAATAATAAA AAACAAAGAA GCCCCCTAAT 3000

ATATCTATAG GTCTACAAAT GGCCTTAGAT TCTATTAGGG GGCATATTAA TATGTTAATT 3060 

TAGTTCGATA ACACATGCTT CATATGGACG TAACTGTTTT AAATTAACTT TGGCATCATA 3120 

ATTAAATAGC TTTACTTCTC CATGGCTTAA ATCAAATGGT ACAGTTAATT CTGCTTCGTG 3180 

GTTAGTAAGA TTACCTACAA TAAGAACTTG CTTTTCATTT AATGTTCTCG TGTACGCAAA 3240

AACTTGTGAA TTTTCAGCAT CTACTAAATC AAATTGACCA TATACGTATA CATCATTAGA 3300

CTTTCTTAAT TGAATTAAAT CTTTATAAAA TTGTAATACT GAATGCTCAT CTTCTAATTG 3360 

TTGTGCAACA TTGATAGTTT TATAATTCGG ATTCACTGGG AACCACGGTT CACCATTTGT 3420 

AAATCCTCCA TTTAACGTAT CATCCCATTG CATTGGTGTG CGAGAATTAT CTCGGTTCTC 3480

ATCTTTATAT TTCGCAAGTA AAGCGTCTAC ATCTCCACCT TGAGCTTTCA CTATTTGATA 3540

GTCATTTTTA ACAGCAACAT CGTTAAACGT TTCAATACTT TCAAATGGAT AATTCGTCAT 3600

ACCAATTTCT TGACCTTGAT AAATGAATGG CGTACCTTGT TGCAAGAAAT AAACAGCTGC 3660

ATGACTTGTT GCTGATTCAT ACCAATACTT GTCATCGTCA CCCCACGTCG ATACACGTCG 3720 
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CCATCTATTT AATACAGATT TATACGAATT TACATCAAAG TGAGAATCAC CACTATTCCA 3840 

CAGTCCCAAA TGTTCAAATT GGAATATCAT ATTAAATTTA CCATTTTCTT CCCCGACCCA 3900

GTCATCAGCA TCATCAGGGC TTACACCATT CGCTTCACCA ACAGTCATAA TGTCATACTT 3960

ACTTAATGAG CGATCTTTCA TCTCTTGTAA CCAAGTTTGT ATACCTGGCT GATTCATATC 4020

TACATCAAAT GCTGGGGCAT ATGTTTTACC CTCAGGTACA GGTAAGTCAC CCGCTTCAAA 4080

CGTCTTCTTA ATATGCGTAA TTGCATCTAC TCTAAATCCA TCAATGCCTT TATCAAACCA 4140

CCAGTTCATC ATTTCAAATA CAGCATCTCT AACTTCCGGA TTACCCCAAT TCAAATCAGG 4200

TTGTTTTTTA CTGAATAAAT GGAAATAATA TTGCTCAGTA TTAGCATGAT ATTCCCATGT 4260 

AGATCCATTA AATATACTTT CCCAGTTGTT AGGTTCAGAG CCATCTGGCT TTGGATCTTG 4320

CCAAATGTAC CAATCACGTT TGGGATTGTC TTTACTAGAT TTGGATTCTA TAAACCAAGG 4380 

ATGTTCATCA GATGTATGAT TTACAACTAA ATCTAAAATA AGCTTCATGC CTCTATCATG 4440

AACACCTTTT AATAAACGAT CAAAGTCTTC CATCGTTCCA AATTCATCCA TAATCTCTTG 4500

GTAGTCACTA ATATCATAAC CATTGTCATC ATTAGGTGAT TTAAACATTG GACTGAGCCA 4560

AATGACATCG ATACCGAAAT CTTTTAAGTA GTCCAATTTA TCAATCATTC CAGGTAAATC 4620

CCCAATACCA TCGTGATTAC TATCATTAAA ACTTCTTGGA TATACTTGAT ATGCTACTGC 4680 

TTCTTTCCAC CATTGCTTAT TCATTTTAAA ACTCCTTTGC TATCGCTGTG TTGATTTTCT 4740

TATTTTTAAT TCTGTATCTA TAATGACGAG TTCAATAACA TCCTGTGCTT TGTTTTTCAA 4800 

TATATTTAAA ATTGCTGCAC CAGCCTGTTG ACCTAACATT CGAGGCTTGA TGTCAATACA 4860

GGTTTGTGGT GGTGACGCAA TTTCGGTTAA ATAAGAATCA TTGAACGTTG CTGTCATTAC 4920 

ATCTTTCGGA ATTTCAATAT TAAGTTCATA TAGGACACTT AAAATCGCTA AATGTAACAT 4980

AGCATCTAAC GAAATGATTG CCTGTTTAAT ATTTGGGTCC TTCAAACGCG TATGTAGATT 5040

TTGCATGTAA TTAAAAATAA CTTCTCTTTC ATTACTAGTC TCAATAATTT GATAATTAAT 5100

TTTATTTTGA GAAGCTATCG TTTCAAATCC TTGAATTCTA TCTTTTGAAA CTTCAAAATT 5160 

TCCTTTTTCT GTAATAAATA TTAATTCATC TACACCTTGT TCAATAACAT GTCGTGTCAA 5220 

ATTTTCAGAA GCTAATATAT TATCATTATC TATATGTGTA AATTGATGAT CTATATCCGA 5280

TGTAGGCTTA CCAATCACAA TAAATGGCAT GCTTTCATCA ATTAACATTT GTTTAATCGG 5340 

ATCATTTTCT TTTGAATAGA GCAGTATAAA CGCATCAACC ATTCGTTGTT TAATCATTTT 5400

ATAAACTTCA TCCATTAAAT CATTCATATT ATTTGAGACT GTCGTTTGTG TACCATAGCC 5460 

ATGCTGGTTA CACGTTTCAG AAATTCCTAG CAATACATTG ATGTAGAATG GATTCAGTCG 5520 
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15 



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SO 



AGTTCTAGCA GCGGTATTAG GAAAATAATT CAATTCTTCC ATAACTTTCT TCACTTTTGA 5640

AATTGTCGCT TCGCTAATAC GTTGATTTCC TTTTATAACT CTTGAAACTG TCGAAGGAGA 5700

AACACCGGCT TTTAGTGCAA CATCTTTAAT CGTAACCATT TAATCACCTC CTGTTAATTT 5760

CTGCATCGGA AAACGCTTCC AACCACTGTA TAATACCAGT TTAGTCACAC TTTCTAAAAA 5820

AGTCAAAAGA TTTGTGCAAA CGATTGCATA AAACGATAAA AATAAAACCT TCATACTGAA 5880

ATTCAATCCG AAAATCAATA TAAAGGTTTG TATAAATATT AAAATCGATT GTTTAGTCAC 5940

TAACTGCAAA ATAGTTACCT TGGCCATCTT GAAAATTAAA TACACGTTGA CCATTCATTT 6000

CTACTATATC ATGCCCAGTT AAACCTAAAT CATTTAATTT TGAGTATAAT GCATCAAAGT 6060

TTTTCTCTTT AAACATTAAA GATGGTGTTC CTAGGTTCAC TTCCGGGCTA TGCTTTTCAA 6120

TAAATTCTTT TGCCATAATC GTCAATGACG TTTCAGCATC TTTGGTAGGT GATACTTCAA 6180

CTGCAACATA GTCCTCAGCT AACGGTGTTT CACTTACAAC AACAAATTCT AAAGTTTCTG 6240

TCCAAAATGC TTTCGCTTTT TCGACATCAT CAACATATAA CATAACTTGA TTTAACTTTT 6300

CCATAAAATA GTACCTCTAT TTCTCTATAG TACATGCTAT CATAACACAG TAAATATTTT 6360

ATTACTTCAC AAAATGCTTA AAAATATGGC GGGATGCTTT TAAGGTCAAG GATAATACTT 6420

GTGTAATTTT TTATAGGTTG TAGCTACTCT ATCACACTCT CTTTTATATT TATCAAAAGA 6480

TATAAAAAAG GATAGTATCT TTCAACTATC CTTTAATCAA TATTATTCTT CAATCCATTG 6540

TGTATGGAAT ACGCCtTCTT TATCTTTTCT TTCGTACGTA TGAGCACCGA AGTAGTCACG 6600

TTGTGCTTGA ATTAAGTTTG CAGGTAAATC AGCAGCACGG TAACTATCAT AGTAATTAAT 6660

ACTTGATGAG AAACCAGGTG TTGGTACACC ATTTTGAACA CCAGTTGCGA CAACATCACG 6720

TAACGCATCT TGATATTCAG TAACGATGTT TTTAAAGTAA GGATCTAGCA ATAAGTTTTG 6780

TAATCCTGGA TTATTATCGT AAGCATCTTT GATCTTTTGT AAGAATTGTG CACGGATAAT 6840

GCAACCTTCT CTCCAAATCA TAGCTAAATC ACCAAGTTTT AAATTCCATT CATTATCTTC 6900

ACTTGCTTTA CGCATTTGcG CGAAACCTTG TGCATAAGAA CAAATTTTAC TCATATATAA 6960

TGCTTTACGA ATTTTTTCTA AAAAGTCTTT CTTGTCACCA TCAAATGATG CTTTTGGACC 7020

ATTTAATTCT TTAGAAGCAT TTACGCGCTC TTCTTTGaTT GAAGAGATAA AACGTGCAAA 7080

TACAGATTCA GTAATGATTG TTAATGGAAT ACCTAATTCT AATGCGTTAA TTGAAGTCCA 7140

TTTTCCTGTA CCTTTTTGaC CTGCAGTATC AAGAATTTTT TCAACTAATG CTTCTTTATT 7200

TTCATCTAAT TTCATGAAAA TATCACCAGT GATTTCAATT AAATAACTTT CTAATTCACC 7260

AGCATTCCAG TCTTTGAACG TTTGAGCAAT GTCTTCATGA GACATGCCTA ATAATTCTTT 7320
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CATTTTCACA 


TAGTGTCCAG 


CACCATTAGG 


TCCAATATAA 


GTAACACATG 


AAGCACCGTC 


7440 




TTTTGCCTTT GCAGCAATTG CATCAAGAAT ATCTGCAACT TTGTTATAAG CTTCTTCTTG 7500

TCCACCCGGC ATTAATGACG GACCAGTTAA CGCTCCAATT TCACCACCAG AAACGCCCAT 7560

ACCAATAAAG TTGATTGCAC TTTGTGywAA TGCTTTATTA CGTCTGATAG TATCTTGATA 7620

GTTTGTATTA CCACCATCAA TTAAAATATC TCCATCATCT AATAAAGGTA ACAAACTATC 7680

AATCGTTGCG TCCGTAGCTT TACCTGCTTG AACCATTAAT AAAATTTTAC GTGGTTTTTC 7740

TAAAGAATTA ACAAATTCTT CCAATGAATA CGTTGGATGA ATATTTTTCC CTTTTGATTC 7800

TTCAACCATT AAATCAGTTT TTTCACTTGA GCGGTTAAAT ACAGATACAC TATATCCGCG 7860

TGATTCAATA TTCCAAGCTA GGTTTTTACC CATAACGGCT AAACGAATAA CTCCAATTTG 7920

TTGTGTCATA TTACTTACCT CACTTGTTGA TTTTTCATTA GTATTGTATC ACAAAATAGA 7980

CATACACTAC ACTAAATCAT TTCGAATGTC GCGCAACTAT TTTGATTATT TCTAACACTT 8040

GACTTGCAAG CAAGTTCAAT GATTTAATCG GCATTCTCTC ATTTGTTGTA TGGATTTTTT 8100

CATAACCCAC TCCTAAAATG ACTGAAGGAA TACCAAATGT ATTAATAATA CTGCCGTCTG 8160

AACCGCCACC AGAAATAATT GTATTTGCAG ATAATCCTAA ATTACGAfiCA CTTTCTTGTG 8220

CAATTTTAAC AACCGCTTCA TTATCATTAA TTTTAAATCC TGGATAACTT TGCTCCACTG 8280

TAACTACTGC TTTCCCACCT AATTCTGATG CAGTAGTTTC AAACACATCA GTCATATGTT 8340

TGACTTGTGT TTTTATTCTT TCTGGATCGT GAGAACGTGC CTCTGCTTCT AAAATGACTT 8400

CATCTGCAAC AATATTCGTA GCTGAACCGC CATGAAACTT ACCAATATTG GCAGTAGTTA 8460

TTTCATCAAC TTGTCCTAAT TTCATTCGAC TAATTGcTTT CGCCGCAATA TTAATAGCAC 8520

TAACACCCTC TTTTGGCGTA CTTGCATGAG CCGTTTTGCC AAAAATTTTA GCTGAAATTA 8580

ACATTTGCGT CGGTGCACCT ACAACCGTAG TACCGACATC AGCACTTGCA TCAATAGCAT 8640

AACCAAAGTC CGCGTCCAAC AACTCTGAAT TTAATTCTTT AGCACCAATT AAACCTGATT 8700

CTTCTCCAAC AGTAATCACA AATTGAATTT GTCCATGTGG GATTTGTTGT TCCTTTATCA 8760

CTTGCAAAAC TTCAAGCATC GCTGATAATC CTGCTTTATC ATCTGCACCT AGAATAGTCG 8820

TACCATCAGA GTATATGTAG CCGTCATCTT TTACAATTGG CTTTACATTA ATTGCGGGTA 8880

CAACAGTATC CATATGGCTC GTCAAATATA ATTTAGGTAC TTCGCCTTCT TCGATAGTAC 8940

TATTCATTGT ACACACTAGA TTATTGGCAC CTAATTTAGG ATGTTTAGCC GCTTCATCTT 9000

CTTTAACATC TAACCCTAAT GCTATGAATT TTTCTTTTAA AATAGGTTGG ATTGTTGATT 9060

CATTCCCTGT CTCAGAATCG ATTTGTACAA GTTCAAAAAA CGTATTAAGT AATCTTTGCT 9120
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20 






GATGAAATAA AATGTTACAG TAATTGACGT TACACAGATT TATCAGGTTT GTAAATTGTG 9240 

TCATATTATT TTCAATTTAT TATATATAAT TATTGTAACT CAAACTAAGC TTTGTCAAAA 9300

ATATATTGAT TGATTTTTCA AAGATATCGT ATAATGAGGA AAATGACATA AGCAAACTTA 9360 

CTCATGTTTT TTATTATATT CCTTTATGAT GATTGCTAGT TATATCGTCT CAAGTTAAAA 9420 

GTTTTATATC TTATGTCGTA ATTATTAATA CAAAGGTTAT TCATTTGGAG GCACACAAAA 9480 

TGCAAAATAA AGTTTTAAGA ATTATCATTA TCGTTATGCT TGTATCAGTT GTATTAGCAT 9540 

TGTTATTAAC GAGTATCATT CCAATTTTAT AAACTATATC TCAACTACCT ATACAAAATC 9600 

ATACAATTAA AAATCCATCC ATTATAAACG CATGTATTAA TAAGTTATCG TATTGCAACG 9660 

ATTACTTTCA AACATGGGTC ATACGGATGG ATTATTTTTT AAGCTACTTC ACTATGCATT 9720 

TTCAATGAAC CAAATTGCGA TTTGATTTGT AAATATTCTT CTAATTCATT TAATATTTGA 9780 

ATAATACTTG CTCTCGAGTT AAGCGCTTTG TGTGTTGTTG GCAATGGCAG TTCATCCAAT 9840

TTCAAAGGCG TCTGATACAA ATTGTGTAAA CGCATTGCTG TATAGTCATT ACTATTCACA 9900 

TTTAGACCAA TTTCTTTCAG CAGTGACGCA ACATCATTTA AAAGCGGATC TTTATGACAG 9960 

ATACTTTCGA TGAGCGGTTT CATTCTCATT AACAATTCCA CTTGCTCTTC TCGCATATCA 10020 

AAATAATGAT AGTATGAATT TTCGTTTCTA ACAAAATGAT TTTTAACATC TCGGAACGCG 10080 

ATAGACTtCG CCTTTTTAAT ATTTAAAAGT AACACTTCAA ATTCAATCGC AATGGTATCT 10140

TCATATTTTT CACAAATATA ACTATATTTA CTAAAAATAT CAGCAATTTG TTGCTCAATT 10200 

TTACATTTGT ATTCGTCtAG TTGTTTGTCT AAACTTGGCA TCATTAAATT CaTTGTAAAT 10260 

GCAATGCTTA GTCCAATTAA CAGTAATAAT GTTTCATTAA CAATTAAATG TGCATCAATT 10320

GATTTTGCAT TAAAAACATG AAGTAATATA ACGCAACTCG TAATGACACC TTCTTGTACT 10380

TTTAATACGA CAGTTAATGG TATAAATAAC AATACGATAA TACCGAGTAC AATTGGACTC 10440 

TGACCTAATA AACTAAATAT TGCTGAACCT AAAAACAATA CTAAAAAACA TGATACTAAT 10500 

CTTGAAATAA TCGCTTGTAG CGAATGTACT TTTGTATGTT TAATACATAA TACGACTAAT 10560 

ATGGCGCTTG AAGCATAATT ATCTAAACCT AACAGCTTAC TAATAATTAC ACCTAAAGTC 10620 

ATACCCACTG CTGTTTTTAT TGTTCTAAAT CCAATCTTGT AAGGATTTAA CTTTAACATG 10680 

GGTTAGCGCC TCTTATCTTT CTTCACAATA TTTATTGAAT AATGTTTGTA ATTGATTAAT 10740 

TACGTTCATC ACATCATGAC CTTCGATTTG ATGTCTTTCA ATCATTTCTG TAATCTTTCC 10800 

ATCTTTTACT AATGCAAATG ACGGACTTGA AGGCGCATAA CCTTCGAAGT ATTCACGCGC 10860

TCTTTGTGTC GCTTCTTTAT CTTGTCCAGC AAATACTGTC ACTAGACGAT CAGGTAATAC 10920
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AGAATTGATC 


ATAACTAGTG 


TTGTACCATC 


TTGTTTAAGA 


ACTTTGTCAA 


CAT CTTCTG C 


11040 




AGTAGTTAAT TGCTCATATC CCGCAGATTC AATTTCATTC CTTGCTTGTT CTACAACACC 11100

GTTCATGTAT AAATCGAAAT TCATGnCCAT AAGTTCAATC ACCTATCCCT TTATATTTAA 11160

ACTAtCCTCA TTCTACTAAT TAATAACATA TTGTTCAATA AACTAATCTG AATCACACCT 11220

ATATTTAGAC ACAATTTTAA CAATATACCA AACATTATTG TGCTTAAAAT CATGGTAACT 11280

AATTTGTTCA CATGTTTTCA TTAATATGTT TCAAGTATGA TGTCTTATTT TGACTTTACT 11340

GCAAAAATGC ATTCAACCAT GTTGATTATT GTTCTTTATC TTTTTTGAAT ATATTGCACA 11400

TATTTTAGTG CCAAAAAATA ATACATCCAT CGACAAGAAC AAGATAAAAC AAGTTGTCGA 11460

TAGATGCATC TATGTTATCA CTAATATATA TTTGTATTTT CTAAAGTATA CTGTTCGATA 11520

CGCTGTTTAA TATGATTCAT ArATTTACCT GTTTGTAAAC CATCTAAAAT ACGATGATCA 11580

ATTGAAATAC ATAAATTAAC CATGTTACGA ATTGCAATCA TATCATTAAT TACTACTGGC 11640

TTTTTAACGA TTGATTCTAC TTGTAAAATC GCTGCTTGTG GATGATTTAT AATACCCATT 11700

GATGATACTG AACCAAATGT ACCAGTATTA TTTACCGTAA ATGTACCGCC CTGCATATCT 11760

TCAGCTGTCA ATTGCTTATT ACGCGCTTTC GTTGCTAAAG TATTAATTTC TCTAGCTATA 11820

CCTTTGATTG ACTTTTCGTC TGCATGCTTA ATCACAGGTA CGTATAATTT ATTTTCATCA 11880

GCAACAGCAA TTGAAATATT AATGTCTTTA TGTAAGACAA TTTCATTTCC TTGCCAGCTA 11940

CTATTTAATA AAGGATATGC TTTTAAAGCA TCTGCTACAG CTTTTACAAA GAAAGCAAAG 12000

AACGTTAGAT TATATCCTTC tttattttta AAGCTGTTTT TATAATGATT TCTCGTATTC 12060

ACAAGATTTG TAGCATCTAC TTCAATCATC ATCCATGCAT GTGGAATCTC TGTTACACTA 12120

TTAACCATAT TTTGCGCAAT TGCTTTACGC ACACCATTTA CTGGTATTGT GCTGTTTTCA 12180

CTATTGTCTT CAGATGATTG GTTACTTGAT GTATCTACTG ATGTTGATTT TGTTTGAACT 12240

TGTTTGTCAG ATTGAGCTGT GGTACCACCA TTTTCAATAA CTGACATTAT ATCCTTCTTA 12300

GTTACACGAC CTTCAAATCC ACTACCTACA ACTTGTGATA AATCAATGTC ATGCTCTGAA 12360

GCGAGTTTAA ATACAACAGG TGAAAAGCGA CCATTATTAC GTGGTTGATT TTGTTTAGCA 12420

GTAGATGTCT GTTCCACTGT TGCACTAGCT TTnTAGTAG ATTTCTGAGT ATGCTCATCC 12480

ACTTTTGCTT GTATCTCTTC AGTTGTTTCA TTTGTCTTTT CATCAGCAGT TTCAATTTTA 12540

CAGATAATTG TATCAATAGC TACTGTCTGC CCCGCTTCAA CTAAAATTTC TGTAATTGTT 12600

CCTGATATCG TGGAAGGGAC TTCAGCTGTC ACTTTATCTG TAATAACTTC ACATAATGGT 12660

TCATATTCAT CAATATGATC ACCAACAGAA ACTAACCATT GTTCAATGGT GCCTTCATGA 12720
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ft ft ft ^* ft fWW*W*« ft MVWM «h m 

AATTCACGCA TTTTATTTAA 


GATTTTTTCT 


GGATTCATCA 


TAATTTCATT 


TTCTAATACA 


12840 




UOAVjAAAA i vj VjCATAGATGG 


TACAtCTGGA 


GCAGCTAAAC 


GCATGATTGG 


TGCATCTAAA 


12900 


5 




ft ft *Wft ^ mji i_iiijimii 

AATAATCGCT 


GACACTTCTG 


ACATAATACT 


ACCTTCTAAA 


12960 




TTRTPTTPft^ ^H^p ft ft ft ft ft 

1 iAlv-i 1 L ALx TTACZAAGT AA 


ft ft ^IMVWt & jim^m*** 

AACTTTACCT 


GTATGTTTAG 


CACGATCAAT 


AATTGTTTCT 


13020 


10 


I AA i\j GATAAACAGT 


TCGTAAATCA 


ACGACTTCAA 


CATTGATACC 


GTCTGCAGCT 


13080 




AAAATATCLG CTGCTTGTAA 


ACAATAATTG 


AC CATT AAT C 


CATAACAAAA 


TACTGTTAAA 


13140 




lull LAC L IT CACGTTTCAC 


ATCTGcrrrr 


CCTAAAGGTA 


CAGTGTAATA 


TTCTTCTGGC 


13200 


IS 


ACTTCTTCCT TTAAGAAACG 


ATAAGCTTTT 


TTATGCTCAA AGTACAATAC 


TGGATCATTT 


13260 




GATTCGATAG ATGATAATAA 


AAGCCCTTTA 


f+ ft ft W ft ^v** 

G CATCAT ACG 


GTGTtsGAAGG 


AATAACAATT 


13320 




GTTAAACCTG GCGATGAAGC 


AAATATACTT 


TCAATACTTT 


GTGAATGATA 


TAGTCCTCCG 


13380 


OfS 
£w 


TGAACAC CG C CACCAAATGG 


TGCACGAATC 


ui 1 AA I LfOOL 


ATTGCCAATC 


ATTATTTGAA 


13440 




CGATAACGCA TTTTCGCAGC 


TTCACTAATA 


A 1 1 1 vjA 1 1 I Vj 


«T^^*»/^/*^ft /*•« /lyf »ft ft 

TCGCAGGTAA 


AATAAAATCT 


13500 




GCAAATTGAA TTTCTGCAAT 


TGGTCTTTTA 


/"■v^T* ft (^"•/" , ft t» ft r» 


LTGCACCAAT 


GGCAGTTCCA 


1356a 


25 


ACAATATTTG ACTCAGCTAA 


TGGCGTATCG 


n X nn^. 1 ^. X w i 


(-11 U\LU\TA 


TTTTTGTTGC 


13620 




AGTCCTTGAG TAGTACCAAA 


TACGCCACCT 


x a x w x fW-.\~J\r\ 




AAGAATAAAC 


13680 




ACATCTTTAT TTWrrGTAA 


TGCTAAGTCT 


X w X V7v» V» l.Vj^v*J 


X A 1 \.VA.L lUl 


AAATAAGATA 


13740 


ATTTAGCCAT TAGTTAAGAC 


TCCCTTCTTC 


GT AC A C A AAT 
x X 




CTTCGACACT 


13800 




TGGATATGGC GCGTCTTCAG CAGCCTTTGT CGCTTTATTG ATGATGTCTT TnATgTCCGC 13860


TTCTATTTCT GCCAACCAAG CATCATCGAT AATGCCAGCT GAAAGCAACT CTTTTTTGAA 13920




CTTTTCATTG CAGTCTGCTT 


TTTTAAGcGT 


TTCACCfCTCT 


TCTTTCGTAC 


GATATTGGTC 


13980 




GTOCTCATCT GATGAATGAG 


CTGTCATACG 


ACTTGTTACT 


GCTTCAATCA 


AAGTTGAACC 


14040 


TTGACCAGAA ATAGCTCGAT CTCTTGCTTC TTTCATCGCT TTATACATTG CTAATGGATC 14100




ATTACCATCT ACTTGTTCAC CATGTATACC GTAACCAAGT GCTCTATCCG ATAATTTTTC 14160




AGCTGCGTAT TGTAATGAAT CAGGTACTGA AATTGCATAT TTATTATTTA TAATGACACA 14220


TACAAAAGGA AGTTTGTGTA CACCCGCGAA GTTTAAACCT TCATGGAAGT CACCTTGGTT 14280




TGAGCTACCT TCACCAACAG 


TTGCTGTTGC 




TTACCATCCA 








AA'rrrxvrrc 


TTTTTAAAGC 


14340 




TAAAGCAGCA CCAACAGCAT 


GOGGTATTTG 


AGTTGCTACC 


GGTGAACTTT 


GAGACAAAAT 


14400 


ATTCTTAGCT CTACTACTAA 


AGTGTGATGG 


cArrixi'i-rrr 


CCACCAGAGT 


TAACATCGTC 


14460 




TTTCTTTCCA AACGCTGATA AAAACGTATC ATACGCTGAG ATACCCATAT AAGTAACGAA 14520
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AATCTGAGTT GCTTCTTGTC CTTGACCACT TACAACAAAT GGAATTTTAC CTGCACGGTT 14640 

CAATAACCAC AGTCTTTCAT CTATTTTTCT ACCTAAATCC ATCCATTTAT ATATTACTTT 14700 

TAGGTCTTCT TCGCTAAGGC CTAATGATTT ATAATCAATC ATGTTAAATC CTCCTATTTA 14760 

TACGTGAATA GCTCTACTTT CTGCTTTCAA TCCTAATTCC ATCAACACTT CAGAGATGGA 14820 

AGGATGTGCG TGTGTTGTTA GTCCTAATTC TAATGCCGAG CCATTCATGA ACTGTAACAG 14880

TGATGCCTCA TTAATCAATT CTGTTACATG TGGACCAATC ATATTAATAC CCACAATTTC 14940

TTCAGTTGAT TGATCAATCA CCATTTCGCT ATACCCTTCG TTTGTGTCAT GGCTATCAAT 15000 

CACTGCTTTA CCAATTGCTT TAAATGGTAC TTTAAAACTT TTAACTTTCA TTCCCTCTGC 15060 

CTTTGCTTGT TCAATGTTTA AACCGATAGA AGCAATTTCA GGTTGTGAAT AAATACACTT 15120 

AGGCATCATG TTATAGTTTA CTGGGATTGG GTTCCCCTCA AACATATGAT CAACAGCCAC 15180 

AACACCTTCT TTTGATCCAA CATGTGCCAA TTGTAATTTT CCTATACAAT CACCAGCTGC 15240

ATAAATATGT TTATCTTCAG TTTGTTGAAA TTCGTTCGTT AAAATATGTC CTGATGTTGa 15300 

AAGtTTTATT TTAGTGTTGT TTAAACCAAT ATCTGATGTG TTAGGTTTTC TACCAATCGA 15360 

TAGCAACACT TTATCTACTT TAATTATGTC TGAGGAAATT TCAAACGTAA CACCATCTTC 15420 

GTTAACATTT ATATCATTTT CAGAAAGTTT TATTCCCTCA TAGAATTTAA CACCACGTGC 15480

TGACAATGAT TTTTTTAATA GTTGTGAAGC TTGTTTACTT TCAGTTGGTA AAATTCTTTC 15540 

ACCTGCTTCT ATAACTGTTA CGTCAACACC TAAATCTATC ATCAATGATG CAAATTCCAT 15600 

TCCGATAACA CCACCACCAA TAATACCAAT ACTTGATGGT AACGTCTTTA ATGATAATAT 15660 

ATCATCGCTA GATAAAATTT TATCATGATC AAATGATAAG AATGGCAACT CTGCAGGCGA 15720 

AGAACCAGTT GCAATTAATA CAAATTGGTT GGGTAATAAG TCTGATTCAC CATCTTCATA 15780 

TTCGACAGAA ATTGTGCCAC TTTGAGGTGA AAATATAGAT GTACCTAGAA TACGTCCCGT 15840 

GCCATTATAA ATGTCAATGT GATTGTGTTG CATTAAATGC TTTACACCTT GATACATTTG 15900 

ATTAATAATG TCTTCTTTTC GTGCCAACAT ATTTTCAAAA TTAACATTAG CATCTTTGAC 15960 

ATCAACGCCA AACATTGCTG CCTGTTTTAC TGTTTGAAAT ACTTCAGCAG ATTTAAGCAG 16020 

CGATTTAGTA GGAATACAAC CTTTATGGAG ACAAGTACCT CCTAATAGTT GTCGTTCTAC 16080 
TATTGCCACT TTTTTACCTA ATTGAGACGC ACGTATCGCA GCAACATATC CTGCAGTACC 16140 
TCCACCGAGA ACGACTAAAT CATATTGTTT CTCTGACATG TTCTTACTCC TAACTAATGA 16200 
TATATATCCA TTGAAAATTT ATTAATACAT AGTTTTCATG TCCATTAATT ACCTATTTTA 16260 
CATGATTGTC TATTTAGTTT GAATGCACAT AAATAAATCC ATAAATGAGT ATTCAACACA 16320 
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TAAATCAGTA ACACTTGCAC CTGAAATCAT TCGTGCAATT TCATCTACTT TATCATCGCT 16440 

AATTAACTCT TGAACTTGTG TTGTTGTACG ATCATCTTTT GATGATTTCG AAATTAATAA 16500 

ATGATGGTCG CTCATCGATG CAACTTGTGG TAAGTGAGAG ATACAAATAA CTTGTATATA 16560 

TTCTGCTaTA TCTCGCATTT TCTCTGCCAT TT 16592 
(2) INFORMATION FOR SEQ ID NO: 54: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13794 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54:

CCAATACAAC GTAAAAAGAT TGCTTGTGTT ATTAATGAGT TAGATAAAAT AATTAAAGGA 60 

TTTAATAAGG AAAGAGACTA CATAAAATAT CAATGGGCTC CAAAATATAG CAAAGAnTTT 120 

TTTATACTTT TTATGAACAT TATGTACTCA AAAGATTTTT TAAAATATCG ATTTAATTTA 180

ACATTTCTTG ATTTATCTAT CTTATATGTA ATATCATCTC GAAAAAATGA GATACTAAAT 240

TTAAAAGATT TGTTTGAAAG TATTAGATTT ATGTATCCTC AAATTGTTAG GTCAGTTAAT 300

AGATTAAATA ATAAAGGTAT GCTAATCAAA GAACGATCCC TTGCAGATGA AAGGATTGTG 360

TTAATCAAAA TAAATAAAAT ACAATATAAC ACTATTAAAA GCATATTCAC AGATACTTCC 420

AAGATTCTCA AACCAAGAAA ATTTTTCTTT TAAATTTAAA CAGATTTACC TCTTGATAAA 480

ATAAATAAGC AATCATACTA CTTCTCAATT TAGTATAAAT AAAAATACAT AATTAACTTT 540

crrnxarriT tatattattt caatacccta ctatatatca caacacataa attaagcatg 600 

ACACTCATTC AATTTAGTTC ACCATTTCGT GTTCCAATTT TACTGAGTAT CATGCTTTTA 660 

ATGTTATAAA CCTAATGCTT TAATAAATCG TGTTAATTCT TCTCGCATAC TGTCATCTTT 720

CAATGCATAT TCTATGGTAG TTTTAACGAA GCCTAATTTT TCTCCAACGT CATAACGTTC 780 

GCCTTCGAAG TCATATGCAT ACACTTGGTT ATCATTATTC ATACGTTCAA TCGCATCTGT 840 

TAACTGAATT TCGTTACCTG CGCCTTCTTT TTGCGTTTTT AAATAATCGA AAATTTCAGG 900 

CGTTAATACA TAACGTCCCA TAATAGCTAG GTTTGATGGT GCCGTACCTT GTGCTGGCTT 960 

TTCAACAAAC TTTTTCACTT CATACTGACG TCCGTTTTTA GTTAATGGGT CAATAATTCC 1020 

ATAACGATGA GTATCTGCTT CCGGAACTTC TTGGACACCT ATAACTGAGT GCCCTGTTTC 1080 

TTCATAAACG TCAATCAACT GTTTCACTGC TGGCACTTCA GATTCAACAA TATCGTCACC 1140 

EP0 786 519 A2 

TAAACCTTTT TGTTCTTTCT GCCTTACATA AAAAATATTC GCAAGTTCCG TTGAATACTG 1260 

AACTTTCTCT AGTAATTCAG ATTTACCTTT TTCTTTTAAC ACCATTTCTA ATTCTTTTTG 1320 

ACTATCAAAA TGATCTTCAA TCGCGCGTTT GTGGCGACCT GTCACTATAA TAATATCTTC 1380 

AATTCCAGCT CTTGCAGCTT CTTCAACGAT ATATTGTATT GTGGGTTTAT CTAAGATAGG 1440

AAGCATTTCC TTTGGCATCG CTTTAGTTGC TGGTAAAAAT CTAGTCCCTA AACCAGCAGC 1500

GGGAATGATT GCCTTTTTTA TTTTTTTCAA AGTTAATGTG CTCCTTTTCC TAAGTATTAA 1560

ATCTATGTAT CAACGTCATT TTAACACTAA TTAGAACGCC TTCATAGTGT CATTGAGTAT 1620

GTAATTATTT CTTGGGAAAT TTGTTTTAAT TTTAAAAAAC AGGCTTACTT CATATAATTT 1680 

ATGAAATAAA CCTGTCAATT TTGGATTGAT TATGCTTTGT GATTCTTTTT ATTTCTGCGT 1740 

AATAACGCTA AACCTAAAAT GCTAAATAAT CCGCCGAACA ACATGCCGTT GTTTGTTGAT 1800 

lUi i- ^- X V_\— f\\- A X *-rtOO invj X 1 \— rtVjrt i i 1L1 i AVji-V X 1 Vj X VjV_ Xllill AVj X 1 VjVj li\CV- X B 6 U 

ACTGCTTTAA CCTTTTCATT GATTTCAATA ACAGGTGTTA CTACTTTACC TTGTTCCACT 1920 

GGTTTAGAAG GTTTTTTAGG TTCTTCTTTA GCAGGTGGTA TTGGTTTACC AGGTTCAGTT 1980 

GGTACCTCTG GCGTTGGCGG TGTTGGTGTT TCCGGCTCGC TTGGTACTTC TGGTGTCGGT 2040

GGTGTTGGTG TTTCCGGCTC GCTTGGTACT TCTGGTGTCG GTGGCGTTGG TGGCACGATT 2100 

GGAGGTGTTG TATCTTCTTC AATCGTTTGT TGACCTTCAT TATGACCACT TACTTGTGGA 2160 

AGTGTATCTT CTTCAAAGTC AACACTATTG TGTCCACCGA ATTGATAATT TGGTTTATCT 2220 

TTATTTGTAT CTTCTTCAAT AATTTCAGTG TGCTTATTGA ATCCGTGAAT ATGTGGCACA 2280 

CTGTCGAAGT CGATATCAAT GATATTACCA CCTTGTTCAT ACTTAGGTTT GTCTTTCTCT 2340

GTATCTTCTT CGAATGATTG GTTACCATTA TTTTGACCAT GAATTTGAGG TACACTATCG 2400

AAATCGATAT CTACGATATT GCCACCTTGT TCATATTTCG GTTTATCTTC TTCTGTGTCT 2460

TCCTCAAATG ACTGATTACC GCTATTTTGG CCACCTTCGT AACCTAATTC ACTCTTAATA 2520 

TCCACGTGGC TATTTTCTTC GATTTCTTCA ATCACGCCAT AATTACCGTG ACCATTTTCA 2580 

GTTCCTAAAC CAGAATGAGA AATATGATGA TTGTTTTCAG TAATTTCCTC GATTGGTCCT 2640

TGCGCTTGAC CATGTTCTTC AGGTAGTTCA TCTACTAGTT CAATCAGATT ACTTTCAGTC 2700

GTATATTCTT TCGTATCTTC AATTGTTGTA TGATCGCTAA CAGCACCAGT TACAATACCT 2760

TTTGTAGAAT CTTCGTCAAA TTCAACTAGG TTAGACTCAG TAGTAACCTG ACCACCACCT 2820

GGGTTTGTAT CTTCTTCATA TTCAACAACA TCAGCATGAT GTTTTGAATT TTCATGTGTC 2880

GATTCTTCAA AGTCTACATG AATAGAATCT TCTTCAGTTT CAATGGTACC TTCTGCATGA 2940

TCTTCGATTG TACCAGTCAA TTCATGCTTC TCCACTGGCG GCTCTGATTT AAATTCAAGT 3060 

TCGATAGGAG TACTATGTTC TATAATAGGT TCCTTTAGTT TATCTTTGCC GTCGCCTTGA 3120 

GCGTTATTAG AGTAAAATGC AACGCCATTT TTCCaAGTTA AATTACTTGT ATAATAATAG 3180

TTATAATATC CAAAAAGGTG TGTTTGAAAT TCTAAGTTGC TAGCATTTGA ATCATAATAC 3240

CCTTCATATT TTATTACATA ATTTTTACTT TGGTCTAAAT TATTAAAGTT TAAAGAATAA 3300

CCACCATTAG TATCAAAATC TAAACTCATA TTATCAGTCA CATCTTCAAA TTTGCTGACA 3360

TCATCAAGCT TTGCATAnTn AgctTTCAGC TAAATCGTCT GAACCAATGT GTTTATATAC 3420

CTTAACTGTT GGATTATTAA CCCCTGGTTT ATTTCCTTTA GTTACTTGAC CAGTTACTGT 3480

CACAGAGCTT AACGACTGGT TGTTAGGTTT CATGTACGCA AAATGACTAA ATTTCCCATC 3540

TACTTTATTT AAAGTATCAA TTCGACCATT AGCTGTTACT CCCCAATTAT CTCTAACTCC 3600

ACCTAAATAT TGAATATTAA ATATTTTGCT AACCGTAGTC TCACCCAATT TAACTTCAAC 3660

ATTTTGGTTA CCTTTTTGCG TCACTGTTGT AGGATCAATA AATAGATTTA AAGATAATTC 3720

AGCAGTTAAA TCTTTCTTTT CTTGTACATA TTCTTTAAAC GTATATCTAA CTTTTCTTTC 3780

TCCAATTATT TCTCCTGTCG CCATAACTTG ACCATCTGTA CTTTTTATCT CCGGAACTTT 3840

ACGCAGTGTT GAGATACCAT GAGTTTCAAC ATTATCGCTT AATGTGAAAT CAAAATAATC 3900

TCCCGCCTTA ATTCCTTCTC CAAATTTCCA TTTATATTTC AAGGTTACTC TTTCTGCGTT 3960

ATGAGGATTT ACAACATTCG TATCTTGTTT ATGTCCTACA ATTTCACTAC CTTCTTCTAC 4020

TTCCACTTTA TTTGTTACAT CTGTACCTGT CGCTTTAGTT TCTTCCACTA CTTCTTTCTC 4080

TGCAACTGCT GTAACGTCAt TGatCTTTTC ATTCTTGGTT TAATTTCTGA GACGTTACTT 4140

GGTTGAGCTA TGTCAACTTG AGTTCCTGTA GTTTCCTTAT CAGCAACTTT TTCCGATGGC 4200

AAATCAACTC GCGAAgTTTC TACTTTTGGT GCTTGCAcAG TTTTCGGTGC TTCTTCTGTT 4260

GTTACTTGTG TTGATTGTGA TGGTTGCTCA GTTGATGTCG CGCTGTATGA TTGTGTTTCA 4320

TCTATTGTAT TAACGTTATT TGTAGTTGTT TGTGTTTCGC TTGCTTTACT TTCAGTAGCT 4380

GAACTCCCAC TTTCCTCTAC TGTAGTATTG TTTTGTTCCG ATGCTGCAGC TTCTTTTTCT 4440

TGTCCCATTC CAACAACGAT CATTGTTCCT AAGAATACTG AGGCCGCTCC CAATTTGTGT 4500

TTTCTTATGC CGTATCTAAG ATTGCTTTTC ACTATAATAT TCTCCCTTAA ATGCAAAATT 4560

CATTTATTTT TAAAACTCAA TAAATGCAAT TCTATATTGT TCGGTTTTTA AAAGCAATGA 4620

AAAAAAGCGA GTTAATAAAA AGTTAAGATT GTTGTTAACT TTATGTATAA TGAGTTTTTT 4680

ATTATTTGAA ACTCACATAT ATATTGCATA CAAAGCTCTT GAACACCTTG ATATAACAGG 4740
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TACTAAACCA TACATAATAA TCGCCTGTAC AATGCATCAT TAACAAGTCA CTGAAACGCC 4860 

TTTCATTGTA TTAATAACGT CACTATAATT TTTATATCGT TCGGTTTTTG TTTGATTTTA 4920

ATGATTATTT ATACAAAAAC AGCCGTATTT CAAGCCGACA TTTTAAATTT AACTAAATTT 4980

GCATCTAGTT AATAATTGCA TTTATCAAAT TTGTCTTATT GATCCAATCT AATTTGTACT 5040

CACAAACTAG TTTAAAATTC TAACTTTATC TCTCAGTTCG TTATCAATCA TCAGACATAA 5100

ACCAATGAAG CAATCAGAAA ACACTCTAAT TTTCTATTAG AAATTTGATT TAATATAAAA 5160

AAACAGGCTT ACTTCATATA ATTTATGAAA TAAACCCGTC AATTTTTGTT TAATTATGCT 5220

TTGTGATTCT TTTTATTTCT GCGTAATAAT GCTAAACCTA GAATGCTGAA TAATCCGCCG 5280

AACAACATAC CTTTGTTTGT TGATTCTTCT CCACCTGTTT CAGGTAGTTC AGATTTCTTA 5340

GATTGTGGTT TTTTAGTTGG TGCCACTGCT TTAACCTTTT CATTGATTTC AATAACAGGT 5400

GTTACTACTT TACCTTGTTC CACTGGTTTA GAAGGCTTTT TAGGTTCTTC TTTGGCAGGT 5460

GGTACTGGTT TACCAGGTTC AGCTGGTACC TCTGGTGTTG GCGGTGTTGG AGTTTCTGGC 5520

TCACTCGGCA CTTCTGGTGT CGGTGGTGTT GGTGTTTCCG GCTCACTTGG TACTTCTGGT 5580

GTTGGTGGCG TTGGTGTTTC CGGCTCACTT GGTACTTCTG GTGTCGGTGG CGTTGGTGGC 5640

ACGATTGGAG GTGTTGTATC TTCTTCAATC GTTTGTTGAC CTTCATTTTG GCCGCTTACT 5700

TTTGGAAGTG TATCTTCTTC AAAGTCAACA CTATTGTGTC CACCGAATTG ATAACTTGGT 5760

TTATCTTTAT TTGTATCTTC TTCAATAATT TCAGTGTGCT TATTGAATCC GTGAATATGT 5820

GGCACACTGT CGAAGTCGAT ATCAATGATG TTACCGCCAT GTTCATACTT AGGTTTGTCT 5880

TTTTCTGTAT CTTCCTCGAA TGACTGATTA CCTTTATTTT GACCATGAAT TTGAGGTACA 5940

CTATCAAAAT CGaTATCTAC GATATTGCCA CCTTGTTCAT ATTTAGGTTT GTCTTCTTCT 6000

GTGTCTTCCT CGAATGACTG GTTACCGCTA TTTTGGCCAC CTTCATAACC TAATTCACTC 6060

TTAATATCAA CGTGGCTATT TTCTTCGATT TCTTCAATCA CGTCATAATT CCCGTGACCA 6120

TTTTCAGTTC CTAAACCAGA ATGAGAAATA TGATGATTGT TTTTAGTAAT TTCCTCGACT 6180

GGTCCTTGTG CTTGACCATG CTCTTCAGGT AATTCATCCA CTAATTCAAT CAGATTACTT 6240

tCAGTTGTAT ATTCTTTCGT ATCTTCAACT GTTGTATGAT CGCTCACtGC GCCAGTTACA 6300

ATACCTTTTG TAGACTCTTC GTCAAATTCA ACTAAGTTAG ACTCAGTAGT AACCTGACCA 6360

CCACCTGGGT TTGTATCTTC TTCATATTCA ACAACATCAG CGTGATGTTT TGAATTTTCA 6420

TGTGTAGATT CTTCAAAGTC AATTGGATTT GATTCCTCAG AGGACTCAGT GTATCCTCCA 6480

ACGTGACCTG ctTCGCTATC CACAGCAGTA TGGTAATCGA TATCAATAGC TGATGAATCC 6540
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TGGTAATCAA TGTCAAGAGT TGATGAATCA TATTCCTCTT CAACAGTAGT TACTAAATTC 6660 

TTATCATATT GACCTGTAAG AGTTTCTTTA ATTGTATCTT CTTTATATTC AAATTTATTA 6720 

TTTTGAATAA TCGGACCATT TTTCTCATTT CCGTTCGCTT TATTACTGTA TAAAACTAAA 6780 

CCATTATCCC AAGTTAAGGT ATATCCTCTA TCATAATAAT ACTTATAAAG TTGCTCTGGA 6840 

TGTCCTACCA TTTGTGTTCT AAAATCAACT TCATCAGTAC CATTTAAATA CTCTCCATCA 6900

TAGTGAACAA CATAAGTTTT ATCTAGATTT TCTATATTCA ATGAATAGCT TCCATTATTT 6960

TGTAAATTCA AATTCCCACT CATATTACTT GTGACTTCTT TAAATTTAGA AGTATCTGTC 7020

GTATTTGCAT ATACACTCTT CGCTATGTCT TCATTATTAC CCAAGTATTC AAATATCCTA 7080

ACTTTTGGTT GATTTCCATT CTGATTACTA CCTTTCATTA AAGTTCCAGT AACAGTCACA 7140

CTTGTCGTTT TACCATTATT AGGTTTAATA AATGCAACAT GCGAAAATCT ATTATTCGCT 7200

TTATTAAATG TCTCAATCGA TCCATTTAAA TTGGCATAAT AATTCCCAAT ACCATCTTTA 7260

TATTTAACAT CTAATTCCTT TGAAGTTTGT TCTTCATTTA GTGTTGAAGT TATAGTTTGA 7320

TTTCCATTAG TTTGTACAGT TTTAGGATCA ATAAATAAAT TAATTTCTAG TTCAGCCGTT 7380

ACATCAACCT TATCTTCAAT ATCATTTGTA AATGTATATC TAATCTTTCC ACCTTCTAAA 7440

ACTTCACCTG TCGCCATTAC GACTGAACCA TTTTTAATTT CTGGTACTTT TCTAGCAGTT 7500

GATACGCCAT GCGTATTTAC ATTATTTGAT AAAGTAAAGT CAAAGTAGTC ACCTTGATGT 7560

AAACCATTCT CAAATTTCAA CTTATATTTT AGTACCGCTC GTTGTCCTGC ATGAGGTTCT 7620

ACTTTATTTG TATTGTTATG CCCCTCAATA GAACCAATTT CTACTGTAAC TTTACTTGTT 7680

ACATCTGTAC CCGTTTCCAC TTTCGCGTTA CTAGCTTCCT TAGCTTCCGC TACATCTGCT 7740

GATCTTGTCA CACGTGGCTT ACTTTCTGAT GCCGTTCTTG GCTGTGCCAC TTCAACTTGT 7800 

GTTTCTGCGA CTTGATTTTG TGTAGCCTTT TTAGGTGTTA AATCTACTTG TCTTTGATCT 7860 

CCGCTATTGT CTTGAGATTG TGTTGTTTCC TTAACTTGAG GTTTCGCTTC TTCCTTAACT 7920 

ACCTCTTCTT TAACTGTTTC TATATTTGCT GGTTGTGCAG TTTGTGGTGC TTGTACTGCT 7980 

TTTGGTGCTT CTTCAGTTGT TACTTGTGTT GCGTTTGACG GTTGTTCTGT TACTGTTGCG 8040 

TTATATGATT GAGTTTCTTC TATATGATTA ACGTTAGTTG CAGTTGTTTG TGTTTCACTT 8100

GTTTTATTAT CAGTAGCTGA ATTCCCATTT TCTTCTACTG TAGTTGTCTT TTGTTCTGAT 8160 

GCTGCAGCTT CTTTGTCTTG TCCCATCCCA ACAACGATCA TTGTTCCTAA GAATACTGAT 8220 


GCTGCTCCCA ATTTATGTTT TCTAATGCCG TACCTAAGAT TGTTTTTCAC TATAATATCT 8280 

CCCTTTAAAT GCAAAATTCA TTAATTTTTT AAACTTAATA AATGCAAGTC TATATTGTTC 8340
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ATGTTAATTG 


ATAATTTTAT 


TATTTGAAAT 


ATACCTATAA 


ATTGTATTCA 


AGTCATCAGA 


8460 




AACCCTTGTC ACACAAGGCT TGTATTTTTT ATACTTATTT TTTAAATTAA ATTCATCATT 8520

ATCTAATTTA AAACAATATA CTAAACGTTT CATAATTATC GCCTGTACAA TACGCACAAA 8580

AACATGTCTT GAAACGCCTT TCATTACTCT AAAATACCCA ATATACTTTT TATATCGTTC 8640

GGATTCTGAG TATTTCAGAC GATTTTCTGC ATAAAAATAA ACGTGTTTCA AGGCAATATA 8700

TTGCAATTAC CTAAAAACAC GTTTACTTAA TATTTAGTTA AACAAATAAG CTAATGAATA 8760

AAATGAAGAT GATACCTGAA ACGGAAATAA TCGTTTCTAA TAATGACCAT GTTAAGAATG 8820

TTTCTTTTAC AGTTAAACCA AAATATTCTT TAAACATCCA AAATCCTGCG TCATTTACAT 8880

GAGACAAAAT CACACTACCT GCACCTATCG CAAGTACAAC TAATGCAACA TTTACATCTG 8940

ATGATTGTAA TAATGGTAAG ACAATACCTG TAGTTGAAAT CGCAGCTACT GTAGCCGAAC 9000

CTAATGCGAT ACGTAGCACA GCTGCAACAA TCCATGCTAG TAAAATCGGA GACATCTCTG 9060

TACCTTCAAA CATTTTAGCA ATTGTATTTC CGACACCGCC GTCAATTAAT ACTTGTTTAA 9120

ATGTACCGCC ACCGCCAATA ATCAATAACA TCATTCCGAT TGGATAAATC GCATTCGTCA 9180

CTGATTCCAT AATATGATTC ATCTTACGCT TTCTCATTAA TCCCATCGTA ACGATTGCAA 9240

ATAATACTGC TATTAGCATG GCTGTCCCTG CTGTTCCTAT CATATAAATG ATAGATTCAA 9300

ATAGATTTGT AGGTTTGTCA TGCCCAGTTA CAAGTTGCGT TATCGTAGAC ACTAACATTA 9360

ATATGACTGG TAATGTTGCT GTTAATAAAC TCATACCAAA TCCTGGCATC TCTTGATCCG 9420

TAAATTCTTT TTGTGCACCT AACGCTGAAA TATCGCCTTC TCGTGTATAC GCAGACGGAA 9480

TCATTTTTTG TGCAcTTTGT TAAATATAGG CCCTGCAATG AGTGTAACTG GaATGGCAAT 9540

AATCATACCA TACAGTAATA CATCTCCAAC ATTTGCCTTT AATTCTTTTG CGATGACTAC 9600

CGGTCCTGGA TGTGGTGGTA AAAAGCCATG TGTCACTGAT AAAGCTUTTA CCATAGGTAG 9660

TCCTAGTTTT AACACTGAAA CATTTGCGCG TTTTGCTACT GTAAATACTA ATGGAATCAG 9720

TAAGACTAAA CCTACTTCAA AGAACAATGC AATACCGACG ATAAATGCTG CAACAAGCAT 9780

TGCCCATTGT ACATGTTTTT GACGAAATTT TTGAATCAAC GTGTCTGCGA TTCGAGTTGC 9840


ACCACCACCA 


TCAGCAAGCA 


ATTTCCCAAG 


TATGGCAWC1 










GTGGCCGAGC GTACTGCCCA TTCCTTTCTC AATCGTCTCC ATAATTTTAG TCAATGGTAT 9960

ACCTAGCATT AACGCTGTAA TCATCGATGT GATAATTAAT GAAATAAATG TATTTAATTT 10020

AAACCCAATA ATTAATACTA ATAAAATAAC GATACCTAAA ACAACACTGA TTAACGGCCA 10080

TATTTCGTTA AACATGACAT TCCCCTCTTT CTCTTTTCAA TAGAATGTAA CACCGTCGTC 10140
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GAGTGACGTA TTTATTGTGT TTTATTTTCA GCGATATGTT GGCGTTGAAA ATCTGCAATT 10260 

TGTTCATAAT TCTCTGTTAA AGAACGACTT AAATTGATAA AAATGGATAC GATCTCTTGG 10320 

TAAACAGTGA CATTTTCTTC AATCGGCGTA TGATTGTTTG TGGCACCGAC CATCGATGAA 10380

ACGATTGAAA AATCTTCAAT GTCACCTACA GCTTTAAGTC CGAGCACGCA GGCACCTAAG 10440

CATGAACTTT CATAACTTTC AGGAACCACT AACTCTGTGT CAAATATATC TGACATCATT 10500

TGACGCCATA CTTCACTTTT CGCAAAACCA CCTGTTGCTT TTATCATCTT AGGTGTTTCA 10560

TTCATTACTT CAATAAGCGC AAGATAGACG GTATACAAAT TGTAAAGAAC ACCTTCTAAT 10620

GCAGCGCGAA TCATATGTTC TTTTTTATGA GATAAAGTTA AACCGAAGAA TGAACCTCTT 10680

GCATTTGCGT TCCAAAGCGG CGCACGTTCT CCTGCTAAAT AGGGATGGAA TATTAAACCA 10740

TCTGCACCTG GTTTAACACG CTTTGCAATT TGAGTTAAGA CATCATAAGG ATCAACACCG 10800

AGACGTTTCG CAGTTTCGAC TTCACTCGCT AGCAACTCGT CGCGCAACCA TCTCAATACG 10860

ACACCACCAT TATTTACAGG ACCTCCGATG ACGTAGTGGT CCTCTGTTAA GACATAACAA 10920

AATATTCTAC CTTTGTAATC AGTACGCGGT TTATCTATCA CAGTACGAAT CGCCCCAGAT 10980

GTACCGATTG TGACAGCAAC TTCTCCTTTA CCAACACTAT TGACACCTAA ATTAGAAAGG 11040

ACCCCATCAC TCGCACCAAT AACAAACGGT GTATCTTTAT TAAGCCCCAT TAATGTTGCA 11100

TAACGTTCTT TCATACCTTT CAtCACATAC GTTGTTGGAA CTAATTCCGG CAACATTTCC 11160

TTGGAAATAC CCAGCAGTTC TAATGCCTCA ACATCCCAAT CTAATGTTTC TAAATTAAAC 11220

ATCCCTGTTG CGGAAGCCAT TGAATAATCA ATGATATATG TATCAAATAA ATGATAGAAA 11280

ATGTATGTTT TAATATCTGC AAACTTAGCA GTACGTTGAA ATACATCTTG CCATTCATGT 11340

TTCATCCAAA AAATCTTCGC TAATGGCGAC ATAGGATGAA TCGGTGTGCC TGTTCGCTGG 11400

TAAATCGCAT TGCCATCATG CACTTCATTT ATTACTGTTG CATATTTTGC AGCGCGGTTA 11460

TCTGCCCAAG TAATATTATT TGTTAATCTT TGATGTTGCT GATCCATCGC AATCAAGCTA 11520

TGCATTTGCG CACTAAATGA CACAAACTTA ATGTCGTCTT TATTAACTTT GGATTCTCTC 11580

ATAACATATT TAATAGTCAT TAGTACTGCA TCAAATAATT CATCTGGGTT TTCTTCTGAG 11640

ACATCAACGT TTGGTGTGTG TAAATCATAG CCTATTTGAT GTTTCATGAT AAAAGTTCCA 11700

TTTTCATCAT ATAAGACTGA CTTGGTACTC GTCGTTCCAA TGTCGACACC AATCATATAT 11760

TTCATGATAA ATCCTTCTTT CTTTCATTTT AATTCAACCA AAATCCTTCA ATATCTTTAC 11820

CAACATCGTC GAAATTTAAA TGAAACGCTT CTTTCAAAAT TTGACTGTCG TATTGTTCCA 11880

CTGCATCAAT AAACACTTGA TGATTATGAT GTATGCGTTC AAAATCTTGC GGGTTCTGTT 11940 
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AAAATGAGTT TAAATATTGA TGATTAGATG 
CTTTGATTAA TGTTTCATGA AATTCAAAGT 12060

CATGCTTCGT AAATGATTCT GCATCCTCAA ATTTTACTGC CACTTTCATC ATTTCAAGTT 12120

GTTTCTTCAT TTCTTTTACG ATAGGTAGTC GCTCTTGATT TTTAACTCTT GAAAATGCAA 12180

ATGACTCTAA CATCAGTCGC AAATCATACA TTTCTTTCTT TTCTTGTTCC CCAAACGGCA 12240

ACACATGTGC ACCCATTCTT TCTAATTGGA TGAGTTGATT TTGTTGCAAT AATTTAAATG 12300

CATCTCGAAT TGGCGAACGA CTCACATTAA ATTGCTTTGC CATTTGATTT TCAGTGAGTA 12360

ACGTACCTTC AGCTATGTGA CCATTCACAA TGCCTAAGCG TAATTCTGCC GCGATACCTT 12420

CTCCAGTTGT CATACCTTCC AACCATTTCT CTGGATATCC ATACATCATC AAAGTCACTC 12480

CTTCATTACA CGACATACTT GTATACAAGT ATGTTAATAT AGTTATTATG AGTTTGCAAG 12540

CGCTTTCTTT ACGAGCACTA AAATAGTGAC CACCCCTTTT CGATTTAAAT TTAAAGGAAA 12600

TGGTCACTAT CACACGAATG ATTTAATTGT TATGTTGTAT GTGGGATATT TCTAATTGTT 12660

CTGTACTCAT ATGCGCTTTA GGTACTTCAA TGCAATAATG CGTTTCATGA CAGTTTGGAC 12720

ATTCGAATCG ACGTGTTGTC GCTGTATGTT TCGCTTTGAT AACTGCCCAC AAAGATGGTG 12780

AGAATATATG CTGGCAGTTA GGACATAAAT AGGCAACCTT TTGTTGGTAA TAAAAAGTAA 12840

CACCAATGCC ATAACCAATC ATAAATGGTA AAGCAATTAA AAACGGCCAT TTATTTTTCA 12900

TCAAAATTGC ACTTATAATG CTAGAATATT GAATTATTCC TATAATACCA GCACTAATCC 12960

AAATGTTACG ACGAATACTT TTCATTTCAG CTGATTTACT CATGACATGC TCTATGTCTT 13020

TTAAGTGTGT GATTGGAGAC GTCGACGCTT CATTTACGTA ATATTGAACA TTTTTAATTT 13080

TGTTTAATAC CGCTTGTTGC TGTTTAACTT GTTGGTTAAT TTCTTGTTGT TTCATAGTTA 13140

GTAAAGTATT GAGCGTCTTC AAAGTACCTT CACCTTTTAG CAACATATCT ATATCGCTTA 13200

ACGC&CAACC TAAATCTTTA AGCAATAAGA TTAACTCTAA TGTTTGTCGC TGTTGTTCTG 13260

TATACACACG ACGCTTTCCT TCTGTAAATC CTTGTGGTTT CAAAATACCT TTGCGATCAT 13320

AATATTGAST CGTTCGTGTT GTCACATTGC ATAATTTTGC GAGTTCTCCA GTCGAATAGT 13380

TAGACATAGA TTCCACCTCC TATAATTACC ATAGTTGATG ACCCGACGTC ACGAGGAAGT 13440

ACAATTTCCA CATTTTAAAG AAATTTATTA TACTAGGCGT CTTATTTTTA TGATTTCGTA 13500

CCATGTTGAT TTACAAACTC ACTCAAACTA AGTAACACAC CTACTAAACA TCTACTCTGT 13560

TATTTCAGAA TGAATTTGTT GTAATTTATC TTCAACTTCA GTAATCTCTG TCGCACATTC 13620

TTTCAGTAAA TCTCGATACT TTTCCGTCTC TGCATTGTTT TTATAACGTA TTTTATGTTC 13680

TAAACTTGCC CACATATCCA TACCTATCGT TCTAATTTGA ATTTCAACAG GCAATACCTC 13740
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(2) INFORMATION FOR SEQ ID NO: 55: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 1059 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 55: 

GGATAAGTTC AGGTAAATTC ATTTCTTTTT CAATTTTGAT TTTCATTGTT TCCGCCCTTT 60

TAAAATAAAG TTAGTTGCTT CTGTTCCTCA TATTCCAAAT CACTTTGCTT TATATATGTT 120

TCAAGCTCTT CCGCTGTATC AAATGTCTTT TTCACACCTT GCCAACCTGG CACGATATGA 180

CCGTGAAAGT AATAAGTGCC ATTTACTACA TGGATATGTG CCACTCGTTC GTTATCCTGA 240

TACAGATATC TCTTAGATCC AAAGAATTGA TTTAGGTATT CTTTACGCGC GCTATCTGTC 300

ATGGTCATCA CTCCTTTTAA CAATTAGGCA GACCAAACGA CATGCATTCG TCGTATAGCT 360

CTTCATTACT TATGCTTGCC TTATAGTTTT CAATCACATT GCTAACTTCT TTATGACTCA 420

TTGCTTTAAC TTGTTCGTCT GTATATTTTT CGCAGTCTTC TAATTCCAGT TGCTCCTGTA 480

ATGACATCAC ATATTCAACT TGTCTTTGGG TTGCCATCGT TAACCCTCCC ACAAGTCAAA 540

AGCTCTTTGG ACGTAAAACT TCGCCTTTGC TAAATCCTCA TGACCATTCT TTAACGGTGC 600

TCTAGACATG TATTTGATTG CATTACCTAT TGCGAATGCT AGTTGAGGTG GATACTGTGC 660

CGTAACCTGT TCGATAAAAT CTATAATTTC AATGTCGCCG TATGTGTAGT GCGCTGGTTG 720

CTTAACATTG TCTTGCGCTT CGTTCATATC TACTTTTCTG TTACTGATTA CGCTCATTAT 780

GCTTCACTCC ATTTCTTGAA CATTTGGTTA TAAGTGACAT CGAACCAGTA CGGATCACGT 840

GAATSTTTTT GTGGCGTTCC ATCATAAAGC CATGGTCTTA ATCTTCTCTT TCTTTCCTGT 900

TCATATTCCG CTCTCACATT TCGTTGGTAT CGGTTCAAAA TCGCTTTTTT TCTGATTTTT 960

TCTCTCCCTT TTTCTTCATC TTTnATtTGA CTCTnCATAT ATTCAACTTC TTCTGTAGAT 1020

nTTGAGTCCT TTCTTCCACA CAATAATTCA nCGCCGCGC 1059
(2) INFORMATION FOR SEQ ID NO: 56:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 30246 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear
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GAAGTAAAAG AAGAATTAAA TTTAACATTA ACAATGGATG AAATTGAATA TGTCGGGACA 60 

ATTGTAGGTC CTGCATATCC ACAACAGGAT ATGTTAACTG AGTTAAATGG ATTTCGCGCA 120 

TTAACCAAAA TCGATTGGGA AAACGTAACT ATCAATAATG AAATTACGGA TATACGCTGG 180

ATTGATAAAG ATAATGATGC GTTGATTGCG CCTGCTGTCA AAGTTTGGAT TGAAACTTAT 240

GGTGGTAAAC ATGACAAATA ATGACACCAT CATGTTACGA CATTATGTCC CACAAGATTA 300

TTCGATGTTA GAAGCTTTTC AATTAAGTGA AAGTGATTTG AAGTTTGTTA AAACGCCAGA 360

GGAAAATATT ACAGCTGCAA TGTCTGATAA TGAAAGGTAT CCCATCGTTG TAATGGATGG 420

CAGGCAATGT GTGGCCTTTT TTACATTACA TCGTGGAAAA GGGGTCGCAC CATTTAGCGA 480

TAACCAAGAT GCAGTATTTT TCAGGTCATT TAGTGTTGAT CAACGTTATC GTAATAGAGG 540

AATAGGTAAA GTGGTAATGG AAAAATTGGC GTCATTTATC ACTTCAACAT TTCAGGATAT 600

TAATGAGATT GTGTTAACGG TTAATACTGA CAATCCACAT GCCATGGCAC TTTATCGCCA 660

ACAAGGATAT CAATATATGG GAGATAGTAT GTTCGTCGGA AGACCTGTTC ATATTATGGC 720

GTTAACTATA AAATAAATTA AATTTAAAAG CATCTTTACT CATCGTCGAC CACAACAATT 780

AATGATGAAT AAAGGTGCTT TTTGTTATAG ATCATCGGAC AATTTACTAT AGTAAAAAGC 840

GACCTAGTGA ACAATTGACA TATATCCACA GGTCGCTTAA CTTAAGTTAT ATTGCTAGTT 900

GCGATTAATT GATAGACTGA TCATTTTTGC GCTGTCGAGA TGGTCTTTTT ATTAAAAATG 960

CCGTAATCCA AGCCGTAATC GGAATACTGA TTGCAACGGC AATACCGCCT AAAATAATAG 1020

AAATAAATTC TTGGGCAAAT ATTTTCGAGT TTATAATATG ACCAAATGAA TATTTAAGTT 1080

TGAAAAACCA AATAAATAAA GCAAGTTGGC CACCAAAAAA GGCAAGGTAA ATCGTGTTCG 1140

CAGATGTCGC TAAAATTTCT CTACCAACAC GCATGCCAGA TTGGAATAAT TCGTATTGCG 1200

TAACBTTgGA TTCACTTGAT GCAATTCATA AATGGGTGAA CTAATGGTAA TTGTTAAATC 1260

TATCACAGCT GCAATAACAG CAAGAATAAT AGTGAACACC ATAAATTGAA CCATATCAAT 1320

GCCAATATTC ATTGAATACA CATATGTTTC ATCTTGTTGT TCGGTTGaAA AGCCTTGTAG 1380

ATGACCGAAG TAGACCGATA AATAAATGAG TGTAATCAAC AATATTGTTG TAACGATAgT 1440

GCtGgATAAA TGCaGCTTGT GTTTTAACAT TGTAACTATT GAGTACGAAT AAATTACAAG 1500

CGCCAATAAT AATGCAGAAA AAGAATGTGA CGACATAAAT CGGTACGCCA AAAATAATCA 1560

ATACAATACT AATAATTAAA ATAGCGAAAT TTAAAAATAG GGTTAAATAA GAGATGAATC 1620

CCTTTTTACC TCCGAAAATT ATCATCAGAA AGAGGAGCAA TAACGCCAAT ATAAATACAG 1680

CATTCATTGT TTCGCCCTCC TTAATGTTTC AAATATTTCC ATAAACAATA TTGTGATAGG 1740 
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CATCGAAATA GTATAAGTCA CTGTATTGGC ATTTTTTAAA AAGATTAAAA ACATAGGTAG i860 

TGCACCGGAT AAATATGAGA ATAATAAGAT GTTAGTCATT GTTCCCATAA TATCTTGGCC 1920

GATGTTTCGC CCAGCAAGCG CCCATCTCCT CATTGAAATG TGTGGCGTAC GCTGTAAAAT 1980

TTCATGCATA CCACTAGCAA TTGTAATTGC AACATCCATA ATAGCGCCAA GTGAACCTAT 2040

TAACACTGAG GCTAGGAAGA TATCTTTCGG TGGTAATGAT AAAAAGTTCA TCGTTTCATA 2100

TTTAATGCCT TTACCATCTG TCATATATAT GATTAATTCT GTTAAACCTA TACTCAAAAA 2160

AGTTCCGATA ATTGTACTGG CTATGGTAAT GAGTGTACGC ATATGCCAGC CTGTAACGAG 2220

CAATAAAGTG AGTATTGTTG AACAGATCAT GGCAATGGTC ATGAGTAAGA ATAAATTAAT 2280

ATTGCTATGT TGAATATGAA TGTAAATTGC GATTAATATG GCAATAGAAT TCAAGATTAA 2340

CGATAAAATC GATTGCAGTC CGACTTTGCG ACCAACCAAT AATACAGTTA ATAAGAACAA 2400

ACCAGTGATG ATAACCGTTA AGGTATCACG CTTCTTTTCT ATAATATAAG CATCACTCGG 2460

CTTGTTAGAA ATATGTAATA ATACTTTTTC GTGTGTGOGA AATGCCTCAG AATCTGCTTG 2520

CGATTTGACG TACTGATGAT TAATCGTCGT CGTTTCTCCA GCAAATTGAC CATTTAATAT 2580

TTTGACTTTT AATTGATTTT TATATTTAAT ATCACGATTA TTTTGTGCAT CTTTTGTAGG 2640

TGTCGAAGAA ACATGTTTGA CATCTATAAT TTGACCAATT GGTTTGTTGT AAAAGTTCTC 2700

ATTATTGAAT GTAAATAAAA TAGCACCAAT GAATGCGATG CAGAACAAAC CTAAAATTAT 2760

ATTAAATGGC TTTGTAAATA AATTTCTATA TTTCAAAAAC AAAACCCCAA TTCTATGAAT 2820

GAATTAATAT GGTGATTATA CGCCCTTAAT TTTTTATTTT CAAAGATATT ACTGCTAAGT 2880 

GTAAAACGAA AATCATCATT GATAGCATCG AATTACTTAA TGGAATGTAG ACGTTTTAGT 2940 

CATTAATTGC TGAATAAGTG TTAATAATAT GCCAATATCA CTCTTTGTAT AAGGCTCCTT 3000 

TGTJyVTAGCA CATATCGTTC TTTTTAATTC AGTATGATCT AATTTTATAT CTATCCATGA 3060 

TTTAGATTCT GGTAAATGTA TATTTTGTGA TGAAATGATG TAACCTTCTT TTTGACGAAG 3120 

GAGATAcTGC GCAAGTGGTT GGCTACTGAT TGTGTATACA TCTGATTTAG TAATCTTGCG 3180 

CAATTGTTTT TTTACAGTTT CGGCAAATGG TGCCAAGCAA TAAATATGAC TATGCTCAAA 3240 

CTGAATTAAT GGTGGGTGTG TCGCCATCGT AATTGGATCG TCTGAAGGCG CATATAAATG 3300 

ATAGTGCTCT TCGAATAAAG GTAGCATATG TAATTGTTTG TGTTTACGTA TTTCTGGTGT 3360 

AAGTTCCGTG AAACCAATGT CTATATTCCC ATTTAATACG CTATTTATAA TTGTGTCATG 3420

TTCTAATAAG CTCGGTATGA CATGTGTATC ATTTTGTAAA TGAAACGTTT GGATAAGTGG 3480

TAGTAACATG TGGGATACGT CACTCTCATC ATAGCCAATG TAGATACTTT TATTTTTAGT 3540 
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TTCATTAAAT AATAATTTCC 
CTTCAGATGT GAGCGTAATA TTGCGTCCTT GCTTTTTAAA 3660

TAAAGACACA TTAAGTTCTT GTTCTAATAA TGTAATTTGA CGGCTTATCG CTGATTGAGC 3720

AATGTTTAGT TCAAGTGCTG TTTCGGAGAT ATGTTCTCTT TTAGCGACCT CGATAAAATA 3780

TCTTAATTGT TTAATTTCCA TAGCGATATA GGCACCTCCA AAAATGAGTG TTTTGTAACT 3840

ATTATAGCAA TATTATTGAT AAATGTTCTA TTTTTTAGAT GAATATCTTC TATTTTATAT 3900

ATTGAACAGA TAAATTTTTT AGATTATAGT AATTATCATT AATAACTAAT ATCAGAATAT 3960

TCTAAAAAAG GGGTGTGCAT CATGCACAAT GAGAAATTAA TTAAAGGCTT ATATGACTAT 4020

CGTGAGGAAC ATGATGCGTG TGGTATTGGT TTTTATGCGA ATATGGATAA TAAAAGGTCT 4080

CACGACATCA TTGATAAATC GCTTGAAATG TTGCGACGCT TAGATCACAG GGGCGGGGTC 4140

GGCGCAGATG GCATCACTGG TGATGGCGCA GGTATTATGA CTGAAATACC TTTTGCATTT 4200

TTCAAACAAC ATGTAACGGA CTTTGATATC CCAGGTGAAG GTGAATATGC CGTGGGGTTA 4260

TTTTTTTCCA AAGAACGCAT TTTAGGTTCT GAACATGAAG TAGTTTTTAA AAAATATTTT 4320

GAAGGCGAAG GGTTATCAAT TCTTGGTTAT CGTAATGTAC CAGTTAATAA AGATGCCATT 4380

GCTAAACATG TAGCAGATAC GATGCCAGTC ATTCAACAAG TGTTTATTGA TATTAGGGAC 4440

ATTGAAGATG TTGAAAAGCG TTTGTTTTTA GCGAGAAAAC AATTAGAGTT CTATTCGACT 4500

CAGTGCGATT TAGAATTGTA TTTTACGAGC TTATCACGCA AAACAATTGT ATATAAAGGT 4560

TGGTTACGAT CAGACCAAAT TAAAAAACTA TATACAGATT TATCGGATGA TTTATATCAA 4620

TCAAAGCTAG GGTTAGTGCA TTCGAGATTT AGTACGAATA CATTCCCGAG TTGGAAAAGG 4680

GCACATCCTA ACCGTATGTT AATGCATAAT GGTGAGATTA ACACGATTAA AGGTAATGTA 4740

AACTGGATGC GAGCACGCCA ACATAAATTA ATCGAAACAT TATTTGGCGA GGATCAACAT 4800

AAAGTGTTTC AAATTGTCGA TGAGGATGGT AGTGACTCTG CCATTGTAGA TAATGCGCTA 4860

GAGTTCTXAT CGTTAGCCAT GGAGCCAGAA AAGGCAGCGA TGTTACTCAT ACCTGAACCT 4920

TGGTTATATA ATGAAGCGAA TGATGCAAAT GTACGTGCGT TTTATGAATT TTATAGTTAT 4980

TTAATGGAAC CGTGGGATGG TCCTACAATG ATTTCGTTCT GTAACGGTGA CAAACTTGGC 5040

GCGCTTACAG ATAGAAATGG ATTACGTCCA GGTCGTTATA CGATTACTAA AGATAACTTT 5100

ATTGTCTTTT CATCTGAAGT GGGTGTTGTG GACGTACCTG AAAGTAATGT TGCTTTTAAA 5160

GGTCAATTGA ATCCTGGAAA GTTATTGCTT GTTGATTTTA AACAGAATAA AGTCATTGAA 5220

AATAATGATT TAAAAGGTGC GATTGCTGGA GAATTACCAT ATAAAGCGTG GATTGATAAC 5280

CATAAAGTTG ACTTTGATTT TGAAAATATA CAATATCAAG ATTCGCAATG GAAAGATGAG 5340
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CAGGAACTTG 


TAGAAGGTAA 


GAAGGATCCT 


ATCGGTGCAA 


TGGGATATGA 


TGCGCCAATT 


5460 


GCAGTGTTGA ACGAGCGACC AGAATCACTA TTTAATTACT TTAAACAGCT GTTTGCACAA 5520

GTTACGAATC CACCAATTGA TGCGTATCGT GAAAAAATCG TAACGAGTGA ACTTTCTTAT 5580

TTAGGTGGCG AAGGTAACTT ACTAGCACCT GACGAAACGG TTTTAGATCG TATTCAATTG 5640

AAAAGGCCGG TATTGAATGA ATCACACTTA GCAGCGATTG ATCAGGAACA TTTTAAATTA 5700

ACTTATTTAT CAACGGTATA TGAAGGGGAT TTGGAAGATG CGTTAGAAGC ATTAGGCCGA 5760

GAAGCAGTGA ATGCTGTAAA GCAAGGCGCT CAAATTCTAG TGTTAGATGA TAGTGGATTA 5820

GTTGATAGCA ATGGCTTTGC AATGCCGATG TTACTCGCAA TAAGTCATGT GCATCAATTA 5880

CTTATTAAAG CAGATTTACG TATGTCTACA AGTTTAGTCG CTAAATCTGG TGAGACACGA 5940

GAAGTGCATC ATGTTGCTTG TTTACTCGCA TATGGCGCGA ATGCAATTGT GCCATACCTA 6000

GCGCAACGTA CAGTTGAACA ACTGACATTG ACAGAAGGGT TACAAGGCAC CGTTGTCGAT 6060

AATGTTAAGA CATATACGGA TGTATTGTCA GAAGGTGTCA TTAAAGTAAT GGCTAAGATG 6120

GGAATTTCGA CAGTGCAAAG TTATCAAGGG GCACAAATAT TTGAAGCGAT TGGCTTGTCT 6180

CATGATGTGA TTGATCGTTA TTTTACTGGG ACACAGTCTA AGTTATCTGG TATTTCGATT 6240

GATCAAATTG ATGCTGAAAA TAAAGCACGT CAACAAAGTG ATGATAATTA TCTTGCATCA 6300

GGTAGTACAT TCCAATGGAG ACAACAAGGT CAACATCATG crrri'AATCc GGAATCTATT 6360

TTCTTATTGC AGCACGCATG TAAAGAAAAT GACTATGCGC AATTTAAAGC ATACTCTGAA 6420

GCGGTGAACA AAAATAGAAC AGATCACATT AGACATTTAC TTGAATTTAA AGCATGTACA 6480

CCGATTGACA TCGACCAAGT TGAACCGGTA AGTGACATTG TCAAACGCTT TAATACAGGG 6540

GCGATGAGTT ATGGATCGAT TTCAGCGGAA GCACATGAAA CGTTAGCACA AGCCATGAAC 6600

CAATTAGGTG GAAAGAGTAA TAGTGGTGAA GGTGGCGAAG ATGCAAAACG TTATGAAGTA 6660

CAAGTTGATG GAAGCAACAA AGTAAGTGCG ATTAAACAAG TTGCTTCTGG GCGTTTTGGT 6720

GTAACTAGTG ATTATTTACA ACATGCCAAA GAAATTCAAA TTAAAGTTGC GCAAGGTGCA 6780

AAGCCTGGTG AAGGTGGTCA ATTACCTGGT ACTAAGGTAT ATCCGTGGAT TGCGAAGACA 6840

AGAGGGTCAA CGCCAGGTAT CGGTCTGATT TCACCACCGC CACATCATGA TATTTATTCA 6900

ATAGAAGATT TAGCGCAACT GATACATGAT TTGAAAAATG CGAATAAAGA TGCAGATATC 6960

GCGGTAAAAT TAGTTTCGAA AACAGGTGTT GGTACCATTG CATCTGGGGT GGCAAAAGCA 7020

TTTGCAGATA AAATTGTCAT CAGTGGTTAC GATGGTGGTA CAGGGGCTTC ACCCAAAACG 7080

AGTATTCAGC ATGCCGGTGT TCCTTGGGAG ATTGGTTTAG CAGAAACACA TCAAACATTA 7140
AAAGATGTAG CGTACGCATG TGCGCTTGGA GCGGAAGAAT TTGGATTTGC AACTGCACCA 7260

TTAGTGGTGT TGGGCTGTAT TATGATGCGT GTATGCCATA AAGATACATG TCCAGTAGGA 7320 

GTTGCAACTC AAAACAAAGA TTTACGTGCT TTATATAGAG GTAAAGCACA TCATGTTGTT 7380 

AATTTTATGC ATTTTATTGC ACAAGAATTA AGAGAAATTT TAGCATCTTT AGGTTTGAAA 7440 

CGTGTAGAAG ACTTAGTTGG AAGAACTGAT TTATTACAAC GATCATCAAC ATTAAAAGCG 7500 

AATAGCAAAG CGGCTAGTAT TGATGTTGAA AAACTGTTAT GTCCTTTCGA TGGGCCAAAC 7560 

ACAAAAGAAA TTCAACAAAA TCATAATCTT GAGCATGGAT TTGATTTAAC AAATTTATAT 7620 

GAAGTAACGA AGCCATATAT TGCTGAAGGG CGTCGCTATA CAGGTAGCTT TACAGTAAAT 7680

AATGAACAAC GTGATGTAGG GGTTATTACA GGTAGTGAGA TTTCGAAACA ATATGGAGAA 7740 

GCAGGACTTC CTGAAAATAC AATTAATGTT TATACGAATG GTCATGCTGG TCAAAGTCTT 7800 

GCAGCATATG CACCGAAAGG CTTAATGATT CATCATACTG GAGATGCGAA TGACTATGTT 7860 

GGTAAAGGAT TATCTGGTGG TACGGTCATT GTCAAAGCAC CTTTTGAAGA ACGACAAAAT 7920 

GAAATTATTG CTGGTAACGT CTCATTCTAT GGTGCGACAA GTGGTAAGGC ATTTATTAAC 7980 

GGTAGTGCAG GAGAAAGATT CTGTATTAGA AATAGTGGTG TAGATGTTGT CGTTGAAGGT 8040 

ATCGGCGACC ATGGATTAGA GTATATGACT GGTGGACATG TCATTAATTT AGGTGATGTA 8100 

GGTAAGAACT TCGGTCAAGG TATGAGTGGT GGTATTGCTT ACGTTATCCC GTCTGATGTA 8160 

GAAGCTTTTG TTGAAAATAA TCAACTAGAT ACGCTTTCGT TTACAAAGAT TAAACACCAA 8220 

GAAGAAAAAG CATTCATTAA GCAAATGCTG GAAGAACATG TGTCACACAC GAATAGTACG 8280 

AGAGCGATTC ATGTGTTAAA ACATTTTGAT CGCATTGAAG ATGTCGTCGT TAAAGTTATT 8340 

CCTAAAGATT ATCAATTAAT GATGCAAAAA ATTCATTTGC ACAAATCATT ACATGACAAT 8400 

GAAGATGAAG CGATGTTAGC TGCATTTTAC GATGACAGTA AAACAATCGA TGCTAAACAT 8460 

AAACCAGCCG TTGTGTATTA AGGAAAGGGG GAGATACGAT GGGTGAATTT AAAGGATTTA 8520 

TGAAGTATGA CAAACAGTAC TTAGGTGAAT TATCACTGGT AGACCGTTTG AAGCATCATA 8580

AAGCATATCA ACAACGATTT ACTAAAGAAG ATGCCTCTAT CCAAGGTGCA CGATGTATGG 8640

ATTGTGGAAC GCCGTTTTGT CAAACCGGAC AACAGTATGG TAGGGAAACA ATAGGTTGTC 8700 

CAATTGGAAA CTACATTCCT GAATGGAACG ACTTAGTGTA TCATCAAGAT TTTAAAACTG 8760 

CTTATGAACG CTTAAGCGAA ACAAATAACT TTCCTGACTT TACAGGGCGT GTATGTCCTG 8820 

CACCATGCGA AAGTGCTTGT GTGATGAAGA TTAATAGAGA ATCGATTGCG ATTAAAGGTA 8880 

TTGAACGCAC AATTATTGAT GAAGCTTTTG AAAATGGTTG GGTAGCGCCG AAAGTTCCGA 8940

CTGAAGAACT TAATCTACTA GGATATCAAG TAACTATTTA TGAACGTGCT AGAGAATCAG 9060

GCGGTTTATT AATGTATGGT ATTCCGAATA TGAAACTTGA TAAAGATGTG GTTCGACGTC 9120

GTATTAAGTT AATGGAAGAA GCGGGCATTA CTTTCATTAA TGGTGTTGAA GTCGGTGTTG 9180

ATATTGATAA AGCAACGTTA GAATCTGAGT ATGATGCCAT TATATTATGT ACTGGTGCAC 9240

AAAAAGGTAG AGATTTACCT TTAGAAGGAC GCATGGGTGA TGGTATACAT TTCGCTATGG 9300

ATTATTTAAC TGAACAAACG CAGTTGTTAA ATGGAGAAAT TGATGATATA ACAATAACTG 9360

CAAAAGATAA GAATGTCATT ATCATTGGTG CTGGTGATAC AGGGGCAGAC TGTGTAGCGA 9420

CAGCATTAAG AGAAAATTGT AAATCGATTG TTCAATTTAA TAAATATACG AAATTGCCAG 9480

AAGCAATTAC ATTTACAGAA AATGCATCAT GGCCTTTAGC AATGCCGGTG TTTAAAATGG 9540

ACTATGCGCA CCAAGAGTAC GAAGCTAAGT TTGGTAAGGA ACCACGTGCA TATGGTGTTC 9600

AAACAATGCG TTACGATGTT GACGATAAAG GACACATACG TGGTTTGTAT ACTCAAATTT 9660

TAGAGCAAGG CGAAAATGGT ATGGTCATGA AAGAAGGACC TGAAAGATTT TGGCCTGCTG 9720

ACCTTGTATT ATTATCAATC GGCTTCGAAG GTACAGAACC AACAGTACCG AATGCTTTTA 9780

ACATTAAAAC GGATAGAAAT CGAATCGTGG CGGATGATAC AAACTATCAA ACTAATAATG 9840

AAAAGGTATT TGCTGCTGGA GATGCTAGAC GTGGTCAAAG TTTAGTTGTA TGGGCAATTA 9900

AAGAAGGTAG AGGCGTAGCG AAAGCAGTAG ATCAGTATTT AGCTAGTAAA GTTTGTGTAT 9960

AATCTTTGTA TGGAAATGGT GGTTACGTTG ACGTTGTGAC ATGCTGAATC GAGTTTGAAA 10020

AAATCTAGTA TCTATCAACG TCACATGCCA TCTTTGTAAC CTAAAAACAA AGGTTTGTAA 10080

GACAACAAAT AGATTAATTA TAAGTAGTGA TTTTTTACAT TCGTTTATAG GTCAACTGTA 10140

GTGGAAGACA ATGATTTGTG GTAATCATGT AATGCTTAAA AACAATATTG ACTTTTACAG 10200

AACGTTCATA TATGATAAAT ATTGTGTTTA GGAGGAATAC CCAAGTCCGG CTGAAGGGAT 10260

CGGTCTTGAA AACCGACAGG GGCTTAACGG CTCGCGGGGG TTCGAATCCC TCTTCCTCCG 10320

CCATCAATAT TTATATTAAA TTCTATATAT AATGAAGGTA AGTGCTCAAA TTTTGAGTAT 10380

TTACCTTTTT TATTTGTCTT TGAATGGCTC GTAATTTTTG ATAATAGAAA TGATAAGGCA 10440

TTGAGATTGG AAGGGCATTT GGCTTGTGCA ATATACATAG CTAAATGTCT TTTTTGTTTT 10500

GTGAAATATG ATGGATGGCT TGTGTGGACA AGTTTGCTAT TTATAGATAT GCATTTTTCA 10560

ATTTAGGAGT TGGCCATGCA TCTACACTTT ATAATGGTGA GAGCGTGGTG AGGTATTGTT 10620

AATAACGCAA TTGTAGCGAG GAGTTATTGC TACATATGTC GTTATGGCTC ATTGATTTTC 10680

TGAAATGGCT ACCCCAGATA ATTGTGACAA AATAAAAATA TTTTGTTGAA AGCCTTTACA 10740






EP 0 786 519 A2 






TAAAAAGAGA AGATGTAAAA GCCATCGTAA CCGCTATTGG GGGAAAAGAA AATCTTGAAG 10860

CTGCAACGCA TTGTGTAACA CGATTACGTT TAGTGCTGAA GGATGAAAGT AAAGTTGATA 10920

AAGACGCATT AAGTAATAAC GCGTTGGTCA AGGGGCAGTT TAAAGCAGAC CATCAATATC 10980

AAATTGTCAT TGGTCCAGGA ACAGTCGATG AAGTGTATAA GCAGTTTATT GATGAAACAG 11040

GTGCTCAAGA AGCTTCGAAA GATGAAGCGA AACAAGCAGC TGCACAAAAA GGGAATCCAG 11100

TACAACGTTT GATCAAATTG TtGGGGGATA TTTTTATACC AATATTACCT GCGATTGTGA 11160

CAGCTGGTTT GTTAATGGGA ATCAATAATT TACTTACAAT GAAAGGTTTA TTTGGTCCAA 11220

AAGCACTTAT TGAGATGTAT CCACAAATTG CTGATATTTC AAACATCATT AATGTGATTG 11280

CGAGTACGGC ATTTATTTTC TTACCAGCAT TAATTGGTTG GAGTAGTATG CGTGTATTTG 11340

GTGGTAGTCC GATTCTAGGC ATAGTCTTAG GTTTGATTTT AATGCATCCG CAATTAGTAT 11400

CTCAGTATGA TTTGGCAAAA GGGAATATTC CGACGTGGAA CTTATTTGGC TTAGAGATTA 11460

AGCAGTTGAA TTACCAAGGT CAAGTGTTGC CAGTtTTAAT TGCAGCTTAC GTTCTAGCTA 11520

AAATTGAAAA AGGATTAAAT AAAGTCGTTC ACGATTCGAT AAAAATGTTG GTCGTTGGAC 11580

CCGTAGCGCT TTTAGTTACT GGATTTTTAG CATTTATTAT CATTGGACCA GTTGCGTTAT 11640

TGaTTGGTAC AGGTATTACA TCTGGTGTTA CATTTATATT CCAACATGCA GGATGGCTTG 11700

GCGGAGCAAT ATATGGATTG TTATATGCAC CACTTGTAAT TACAGGACTA CACCATATGT 11760

TTTTAGCAGT AGATTTCCAA TTGATGGGTA GCAGCTTAGG CGGTACGTAT TTATGGCCAA 11820

TTGTTGCGAT TTCCAATATT TGTCAGGGCT CTGCAGCATT TGGAGCATGG TTTGTCTATA 11880

AACGTCGTAA AATGGTTAAA GAAGAAGGCT TGGCATTAAC ATCTTGTATT TCTGGTATGT 11940

TAGGTGTTAC TGAACCAGCC ATGTTCGGTG TGAACTTACC TCTGAAATAT CCATTTATCG 12000

CTGCGATATC AACGTCTTGT GTATTGGGGG CAATCGTTGG TATGAATAAC GTACTTGGAA 12060

AAGTTGGTGT TGGTGGCGTG CCAGCATTCA TTTCAATTCA AAAAGAATTT TGGCCAGTAT 12120

ATCTTATTGT GACAGCTATT GCTATTGTTG TACCATGTAT ACTAACAATT GTGATGTCTC 12180

ATTTTAG r EAA ACAAAAAGCG AAAGAAATTG TTGAAGATTA ATAAAATAAA AAAGGGGCGT 12240

TCGTTATTTG GACGTCCTTT ATTACGTTAT AAGGTGGTAA TTGTGTGTCG AAAGAAATAG 12300

ATTGGAGAAA ATCCGTTGTA TATCAAATTT ATCCTAAGTC GTTTAATGAT ACGACGGGGA 12360

ATGGTATAGG AGATATCAAT GGAATTATAG AAAAATTGGA TTATATCAAG TTATTGGGTG 12420

TTGATTATAT TTGGTTAACA CCAGTGTATG AATCACCGAT GAATGATAAT GGCTATGATA 12480

TCAGCAATTA TTTAGAAATC aATGAAGACT TTGGAACGAT GGATGATTTT GaAAAGTTAA 12540

CGACGGAGCA TGaATGGTTT AAAGAAGCCC GTAAATCTAA AGATAACCCy TATAGAGATT 12660

ATTACTTTTT CAGATCATCT GAAGACGGGC CGCCAACAAA TTGGCATTCT AAATTCGGTG 12720

GTAATGCATG GAAGTATGAT TCTGAGACAG ATGAATATTA TTTACATTTA TTTGATGTCA 12780

GTCAAGCTGA TTTAAATTGG GATAATCCGG AAGTACGTCA ATCGTTATAT CGCATAGTCA 12840

ATCATTGGAT AGACTTCGGC GTTGATGGTT TTCGATTTGA TGTCATTAAC TTAATTTCTA 12900

AAGGTGAATT TAAGGACTCT GACAAAATAG GTAAAGAATT TTATACGGAT GGTCCTAGAG 12960

TGCATGAGTT TCTGCATGAA TTAAATCGTC AAACGTTTGG TAACACTGAC ATGATGACTA 13020

TAGGAGAAAT GTCTTCGACG ACGATTGAAA ATTGTATTAA GTATACACAA CCAGAACGCC 13080

AAGAATTGAA TAGTGTTTTT AATTTTCATC ATCTAAAGGT TGATTATGTT GATGGTGAAA 13140

AGTGGACAAA TGCGAgcTTG nATTTTCATA AGTTAAAGGA AATTCTGATG CAATGGCAAC 13200

GAGGTATTTA TGACGGTGGC GGATGGAACG CGATTTTCTG GTGTAATCAT GATCAGCCAC 13260

GGGTAGTGTC TAGATTTGGT GATGATACGT CGGAAGAGAT GAGGATACAA AGTGCTAAAA 13320

TGTTAGCTAT CGCACTGCAT ATGTTGCAAG GGACGCCATA TATTTACCAA GGTGAAGAAA 13380

TTGGTATGAC GGACCCACAT TTTACATCAA TAGCACAATA TCGTGATGTT GAATCGATTA 13440

ATGCCTACCA TCAGTTGTTA AGTGAAGGGC ATGCTGAAGC GGATGTGTTA GCGATTTTAG 13500

GACAGAAGTC ACGAGACAAT TCGAGAACGC CTATGCAATG GAGTGATGAT GTTAATGCTG 13560

GATTTACAGC TGGTAAnCCT TGGATTGATA TTTCGGAAAA TTATCATCAG GTCAACGTTA 13620

GACAAGCACT TCAGAATAAA GAGTCTATTT TCTATACGTA TCAAAAATTA ATACAATTAA 13680

GACATACGCA TGATATTATT ACGTATGGAG ACATTGTGCC ACGTTTTATG GATCATGATC 13740

ATTTATTTGT TTATGAACGT CATTATAAGA ATCAACAATG GCTAGTAATT GCGAATTTCT 13800

CAGCATCGGC tgttgatttg CCAGAAGGAT TGGCTAGAGA AGGTTGTGTT GTGATTCAAA 13860

CAGGCACAGT GGAAAATAAT ACGATAAGCG GGTTTGGTGC AATTGTAATC GAAACAAACG 13920

CGTAAAATAA ATTGAGTGGA TGCGTTTATA TGGCGAAACA AAAAAAGTTT ATGAAGATTT 13980

ATGAGGCGTT GAAAGAAGAT ATATTAAACG GGCAGATTCA ATATGGTGAA CAAATTCCGT 14040

CTGAACATGA TTTGGTGCAA TTGTACCAGT CATCTCGAGA GACCGTGCGT AAGGCATTAG 14100

ATTTGTTGGC ATTAGACGGC ATGATTCAAA AGATTCATGG TAAAGGGTCA CTTGTCATTT 14160

ATCAGGAGGT TACAGAGTTT CCATTTTCTG AACTTGTTAG TTTTAAAGAA ATGCAAGAAG 14220

AAATGGGCGT CGCATATTTA ACTGAAGTTG TTGTGAATGA GGTTGTTGAA GCGCATGAAG 14280

TTCCAGAAGT TCAACATGCT TTAAACATCA ATTCTAGTGA ATCACTCATT CATATTGTTA 14340
TTGTTTCAGA TATAGGTAAT GATGTTGCGA GTGATTCTAT TTATGATTAT TTGGAAAAGG 14460

TATTAAATCT TAATATTAGT TATTCAAGTA AGTCTATTAC TTTTGAACCG TTTGATGAAC 14520

AAGCATATCA ATTGTTTGGT GATGTATCGG TGGCTTATTC AGCAACAGTT CGAAGTATTG 14580

TGTATTTAGA AAATACAATG CCGTTTCAAT ATAATATTTC AAAACATCTT GCAAATGAAT 14640

TTAAATTTAA TGACTTCTCA AGACGTCGTA TAAAGTAAAC AATGATATAA ATGATTTATA 14700

CTTGCAATTA ACTATTAAAA TATAGTAATA TATATCTTGC CGTGCTAGGT GGGGAGGTAG 14760

CGGTTCCCTG TACTCGAAAT CCGCTTTATG CGAGGCTTAA TTCCTTTGTT GAGGCCGTAT 14820

TTTTGCGAAG TCTGCCCAAA GCACGTAGTG TTTGAAGATT TCGGTCCTAT GCAATATGAA 14880

CCCATGAACC ATGTCAGGTC CTGACGGAAG CAGCATTAAG TGGATCATCA TATGTGCCGT 14940

AGGgTAGCCG AGATTTAGCT AACGACTTTG GTTACGTTCG TGAATTACGT TCGATGCTTA 15000

GGTGCACGGT TTTTTATTTT TTAAATATTA AACCGATTAT TAAGAGTTGA AAATATATAA 15060

TTATAGAAGC TACTTTCTTG AAGACAATTC AGCGTATTAT ACGTGGAACA TGTTTGTGGG 15120

AAGTAGCTTT TTTATATGTG AAGTTTGATT CAAGTGAACT CGATGTGCAG TTTGAATGAT 15180

TTTTGTGTCA ATGAAAAGTA AGAAGTTATA ATTTGATGAT AAAGAAATGA TGGTGAAATG 15240

AGGGGGAGTA TCTTACAATA GAATTATTAA TGAGATACGT TATGATTATT GACAATCAAA 15300

TGCCTACGGA GGACATATGC AAATATATTT AAGTACTTTA ACAGAGTTAG ATTATGATAA 15360

ATCTTTAAAT AGTATTGAAG AAAGTTTTGA TGATAATCCT GAAACGAGTT GGCAAGCACG 15420

TGCGAAAGTA AAACATTTAA GAAAATCTCC TTGCTATAAT TTTGAATTAG AAGTAATAGC 15480

GAAAAATGAA AATAACGATG TCGTTGGACA CGTTTTATTA ATTGAAGTAG AAATTAATAG 15540

TGATGATAAG ACGTATTATG GTTTGGCGAT TGCCTCTTTA TCAGTTCATC CTGAATTACG 15600

TGGACAAAAA TTAGGTCGTG GCTTGGTTCA AGCAGTAGAA GAGCGTGCCA AAGCACAAGA 15660

GTATAGTACG GTTGTTGTAG ACCATTGTTT TGACTACTTT GAAAAGTTGG GTTATCAAAA 15720

TGCTGCTGAG CATGACATTA AATTAGAATC TGGTGATGCA CCGTTACTTG TAAAATATTT 15780

ATGGGATAAT TTGACGGATG CACCACACGG AATCGTAAAA TTTCCAGAAC ATTTTTATTA 15840

ATTGTTCAAT TAAGAAGTAA AGGTATTATC ATGCTATAAT GAGAGGTAAT TGTTTATGGA 15900

GGTGCTAACT TGAATTATCA AGCCTTATAT CGTATGTACA GACCCCAAAG TTTCGAGGAT 15960

GTCGTCGGAC AAGAACATGT CACGAAGACA TTGCGCAATG CGATTTCGAA AGAAAAACAG 16020

TCGCATGCTT ATATTTTTAG TGGTCCGAGA GGTACGGGGA AAACGAGTAT TGCCAAAGTG 16080

TTTGcTAAAG CAATCAACTG TCTAAATAGC ACTGATGGAG AACCTTGTAA TGAATGTCAT 16140
AATAATGGCG TTGATGAAAT AAGAAATATT AGAGACAAAG TTAAATATGC ACCAAGTGAA 16260

TCGAAATATA AAGTTTATAT TATAGATGAG GTGCACATGC TAACAACAGG TGCTTTTAAT 16320

GCCCTTTTAA AGACGTTAGA AGAACCTCCA GCACACGCTA TTTTTATATT GGCAACGACA 16380

GAACCACATA AAATCCCTCC AACAATCATT TCTAGGGCAC AACGTTTTGA TTTTAAAGCA 16440

ATTAGCCTAG ATCAAATTGT TGAACGTTTA AAATTTGTAG CAGATGCACA ACAAATTGAA 16500

TGTGAAGATG AAGCCTTGGC ATTTAtcgCT AAAGCGTCTG AAGGGGGTAT GCGTGATGCA 16560

TTAAGTATTA TGGATCAGGC TATTGCATTT GGTGATGGTA CGTTAACATT GCAAGATGCG 16620

TTGAATGTCA CAGGTAGCGT ACATGATGAA GCGTTGGATC ACTTGTTTGA TGATATTGTA 16680

CAAGGTGACG TACAAGCATC TTTTAAAAAA TACCATCAGT TTATAACAGA AGGTAAAGAA 16740

GTGAATCGCC TAATAAATGa TATGATTTAT TTTGTCaGAG ATACGATTAT GAATAAAACA 16800

TCTGAGAAAG ATACTGAGTA TCGAGCACTG ATGAACTTAG AATTAGATAT GTTATATCAA 16860

ATGATTGATC TTATTAATGA TACATTAGTG TCGATTCGTT TTAGTGTGAA TCAAAACGTT 16920

CATTTTGAAG TGTTGTTAGT AAAATTAGCT GAGCAGATTA AGGGTCAACC ACAAGTGATT 16980

GCGAATGTAG CTGAACCAGC ACAAATTGCT TCATCGCCAA ACACAGATGT ATTGTTGCAA 17040

CGTATGGAAC AGTTAGAGCA AGAACTAAAA ACACTAAAAG CACAAGGAGT GAGTGTCGCT 17100

CCTGTTCAAA AATCTTCGAA AAAGCCTGCG AGAGGCATAC AAAAATCTAA AAATGCATTT 17160

TCAATGCAAC AAATTGCAAA AGTGCTAGAT AAAGCGAATA AGGCAGATAT CAAATTGTTG 17220

AAAGATCATT GGCAAGAAGT GATTGATCAT GCCAAAAATA ATGATAAAAA ATCACTCGTT 17280

AGTTTATTGC AAAATTCGGA ACCTGTGGCG GCAAGTGAAG ATCACGTACT TGTGAAATTT 17340

GAGGAAGAGA TCCATTGTGA AATCGTCAAT AAAGACGACG AGAAACGTAG TAGTATAGAA 17400

AGTGTTGTAT GTAATATCGT TAATAAAAAC GTTAAAGTTG TTGGTGTACC ATCAGATCAA 17460

TGGCAAAGAG TTCGAACGGA ATATTTACAA AATCGTAAAA ACGAAGGCGA TGATATGCCA 17520

AAGCAACAAG CACAACAAAC AGATATTGCT CAAAAAGCAA AAGATCTTTT CGGTGAAGAA 17580

ACTGTACATG TGATAGATGA AGAGTGATAC ATGACAAGCG ATATAATCGT ATGTATAATG 17640

AAAGAAACAT CATTTTATTG ATAAATATTT ATTGATTTTC AAGGAGGAAA TGGAATATGC 17700

GCGGTGGCGG AAACATGCAA CAAATGATGA AACAAATGCA AAAAATGCAA AAGAAAATGG 17760

CTCAAGAACA AGAAAAACTT AAAGAAGAGC GTATTGTAGG AACAGCTGGC GGTGGCATGG 17820

TTGCAGTTAC TGTAACTGGT CATAAAGAAG TTGTCGACGT TGAAATCAAA GAAGAAGCTG 17880

TAGACCCAGA CGATATTGAA ATGCTACAAG ACTTAGTGTT AGCAGCTACT AATGAAGCGA 17940
TCCCTGGaAT GTGATCATAG ATGCATTATC CAGAACCTAT ATCAAAACTT ATTGATAGCT 18060

TTATGAAATT GCCAGGCATT GGTCCAAAGA CAGCCCAACG TCTGGCTTTT CATACCTTAG 18120

ATATGAAAGA AGACGATGTT GTTCAGTTTG CCAAAGCATT AGTAGATGTT AAGAGAGAAT 18180

TAACATATTG TAGCGTATGT GGTCACATTA CTGAAAATGA TCCATGTTAT ATTTGTGAAG 18240

ATAAGCAAAG AGATCGTTCA GTTATTTGTG TTGTGGAAGA TGACAAAGAT GTCATAGCTA 18300

TGGAAAAAAT GAGAGAATAC AAAGGTTTAT ATCACGTTTT ACATGGGTCT ATTTCGCCTA 18360

TGGATGGCAT TGGACCAGAA GATATTAATA TTCCTTCATT GATTGAACGC TTGAAAAACG 18420

ATGAAGTTAG CGAATTAATC TTAGCTATGA ACCCGAACTT AGAGGGGGAA TCTACAGCCA 18480

TGTATATTTC TAGATTAGTT AAGCCTATAG GTATCAAAGT GACGAGATTA GCACAAGGGT 18540

TATGGGTAGG TGGCGATTTA GAGTATGCTG ACGAAGTAAC ATTATCTAAA GCAATCGCAG 18600

GTAGAACAGA AATGTAATkT CTTCTATTAA ACATTTTTGA TTTTAATACT ATAGTAAGAA 18660

AAGTCACAGT GTAATCATTG TGGCTTTTTT TATGGTGTGG TGTGATGTAC TACTTTATTT 18720

GCGGTGTGGC GGTGGTATGG TTTACCTAGT TTTACTGAGG GATGGGTAAT CTTTAGGAAG 18780

CAAGCCGTTG GTTGTGATTT GTTACTTCTA ATAGTAATGA TGTGAATTGG ATTATCGAAT 18840

TAGATCTATG GTTATGGTGT GTTGGTGCTA TTAATTTGAT AAATGCGGTT AATGACTATG 18900

CAAATGAAAT TCTTTTGTAA TTGAAATGAT AGATGCTGGC TTAGTAAGTT GTACTTCTTT 18960

GGTCTAAAGC TTATTAAATC AGCCTGTATA GCGGTGTTTT GAGAGATTAT TTAAAACTTG 19020

TAAATTTATT TTTAATTTCT GGTAAAAAAA TAACGTTCTG TTTTGCGTTT TTTTTGATTG 19080

ATATGGTTAG AGAAAAATCT GTTTCTTGTT CTAAAAAACG TACTATTTAT AAGTGGGGAT 19140

TTTTTAAGTT CGATTTTTAG GATAAGGGCG TTCAGTACAG ATGACAAAGG TGTAATTTTT 19200

ACTGTTGTTA AGCAGTTTGA AAGCCTGTAT AGTATTTATT TGTTGAGGCA AACAAAACAA 19260

CTCAACTTAA GAAATAACTT GAATTACTAA CGAAAATTAA TTTTAAAAAG TTATTGACTT 19320

AAATGTTAAT AAAATGTATA ATTAATTCTT GTCGGTAAGA AAAATGAACA TTGAAAACTG 19380

AATGACAA^A TGTCAACGTT AATTCCAAAA AACGTAACTA TAAGTTACAA ACATTATTTA 19440

GTATTTATGA GCTAATCAAA CATCATAATT TTTATGGAGA GTTTGATCCT GGCTCAGGAT 19500

GAACGCTGGC GGCGTGCCTA ATACATGCAA GTCGAGCGAA CGGACGAGAA GCTTGCTTCT 19560

CTGATGTTAG CGGCGGACGG GTGAGTAACA CGTGGATAAC CTACCTATAA GACTGGGATA 19620

ACTTCGGGAA ACCGkAGCTA ATACCGGATA ATATTTTGAA CCGCATGGTT CAAAAGTGAA 19680

AGACGGTCTT GCTGTCACTT ATAGATGGAT CCGCGCTGCA TTAGCTAGTT GGTAAGGTAA 19740
GAGACACGGT CCAGACTCCT ACGGGAGGCA GCAGTAGGGA ATCTTCCGCA ATGGGCGAAA 19860

gCtGaCGGAG CAACGCCGCG TGAGTGATGA AGGTCTTCGG ATCGTAAAAC TCTGTTATTA 19920

GGGAAGAACA TATGTGTAAG TAACTGTGCA CATCTTGACG GTACCTAATC AGAAAGCCAC 19980

GGCTAACTAC GTGCCAGCAG CCGCGGTAAT ACGTAGGTGG CAAGCGTTAT CCGGAATTAT 20040

TGGGCGTAAA GCGCGCGTAG GCGGTTTTTT AAGTCTGATG TGAAAGCCCA CGGCTCAACC 20100

GTGGAGGGTC ATTGGAAACT GGAAAACTTG AGTGCAGAAG AGGAAAGTGG AATTCCATGT 20160

GTAGCGGTGA AATGCGCAGA GATATGGAGG AACACCAGTG GCGAAGGCGA CTTTCTGGTC 20220

TGTAACTGAC GCTGATGTGC GAAAgCGTGG GGATCAAACA GGATTAGATA CCCTGGTAGT 20280

CCACGCCGTA AACGATGAGT GCTAAGTGTT AGGGGGTTTC CGCCCCTTAG TGCTGCAGCT 20340

AACGCATTAA GCACTCCGCC TGGGGAGTAC GACCGCAAGt TGAAACTCAA AGGAATTGAC 20400

GGGGACCCGC ACAAGCGGTG GAGCATGTGG TTTAATTCGA AGCAACGCGA AGAACCTTAC 20460

CAAATCTTGA CATCCTTTGA CAACTCTAGA GATAGAGCCT TCCCCTTCGG GGGACAAAGT 20520

GACAGGTGGT GCATGGTTGT CGTCAGCTCG TGTCGTGAGA TGTTGGGTTA AGTCCCGCAA 20580

CGAGCGCAAC CCTTAAGCTT AGTTGCCATC ATTAAGTTGG GCACTCTAAG TTGACTGCCG 20640

GTGACAAACC GGAGGAAGGT GGGGATGACG TCAAATCATC ATGCCCCTTA TGATTTGGGC 20700

TACACACGTG CTACAATGGA CAATACAAAG GGCAGCGAAA CCGCGAGGTC AAGCAAATCC 20760

CATAAAGTTG TTCTCAGTTC GGATTGTAGT CTGCAACTCG ACTACATGAA GCTGGAATCG 20820

CTAGTAATCG TAGATCAGCA TGCTACGGTG AATACGTTCC CGGGTCTTGT ACACACCGCC 20880

CGTCACACCA CGAGAGTTTG TAACACCCGA AGCCGGTGGA GTAACCTTTT AGGAGCTAGC 20940

CGTCGAAGGT GGGACAAATG ATTGGGGTGA AGTCGTAACA AGGTAGCCGT ATCGGAAGGT 21000

GCGQCTGGAT CACCTCCTTT CTAAGGATAT ATTCGGAACA TCTTCTTCAG AAGATGCGGA 21060

ATAACGTGAC ATATTGTATT CAGTTTTGAA TGTTTATTTA ACATTCAAAT ATTTTTTGGT 21120

TAAAGTGATA TTGCTTATGA AAATAAAGCA GTATGCGAGC GCTTGACTAA AAAGAAATTG 21180

TACATTGAAA ACTAGATAAG TAAGTAAAAT ATAGATTTTA CCAAGCAAAA CCGAGTGAAT 21240

AAAGAGTTTT AAATAAGCTT GAATTCATAA GAAATAATCG CTAGTGTTCG AAAGAACACT 21300

CACAAGATTA ATAACGCGTT TAAATCTTTT TATAAAAGAA CGTAACTTCA TGTTAACGTT 21360

TGACTTATAA AAATGGTGGA AACATAGATT AAGTTATTAA GGGCGCACGG TGGATGCCTT 21420

GGCACTAGAA GCCGATGAAG GACGTTACTA ACGACGATAT GCTTTGGGGA GCTGTAAGTA 21480

AGCTTTGATC CAGAGATTTC CGAATGGGGA AACCCAGCAT GAGTTATGTC ATGTTATOGA 21540
GAGGAAGAGA AAGAAAATTC GATTCCCTTA GTAGCGGCGA GCGAAACGGG AAGAGCCCAA 21660

ACCAACAAGC TTGCTTGTTG GGGTTGTAGG ACACTCTATA CGGAGTTACA AAGGACGACA 21720 

TTAGACGAAT CATCTGGAAA GATGAATCAA AGAAGGTAAT AATCCTGTAG TCGAAAATGT 21780 

TGTCTCTCTT GAGTGGATCC TGAGTACGAC GGAGCACGTG AAATTCCGTC GGAATCTGGG 21840 

AGGACCATCT CCTAAGGCTA AATACTCTCT AGTGACCGAT AGTGAACCAG TACCGTGAGG 21900 

GAAAGGTGAA AAGCACCCCG GAAGGGGAGT GAAATAGAAC CTGAAACCGT GTGCTTACAA 21960 

GTAGTCAGAG CCCGTTAATG GGTGATGGCG TGCCTTTTGT AGAATGAACC GGCGAGTTAC 22020 

GATTTGATGC AAGGTTAAGC AGTAAATGTG GAGCCGTAGC GAAAGCGAGT CTGAATAGGG 22080 

CGTTTAGTAT TTGGTCGTAG ACCCGAAACC AGGTGATCTA CCCTTGGTCA GGTTGAAGTT 22140

CAGGTAACAC TGAATGGAGG ACCGAACCGA CTTACGTTGA AAAGTGAGCG GATGAACTGA 22200 

GGGTAGCGGA GAAATTCCAA TCGAACCTGG AGATAGCTGG TTCTCTCCGA AATAGCTTTA 22260 

GGGCTAGCCT CAAGTGATGA TTATTGGAGG TAGAGCACTG TTTGGACGAG GGGCCCCTCT 22320

CGGGTTACCG AATTCAGACA AACTCCGAAT GCCAATTAAT TTAACTTGGG AGTCAGAACA 22380

TGGGTGATAA GGTCCGTGTT CGAAAGGGAA ACAGCCGAGA CCACCAGCTA AGGTCCCAAA 22440 

ATATATGTTA AGTGGAAAAG GATGTGGCGT TGCCCAGACA ACTAGGATGT TGGCTTAGAA 22500 

GCAGCCATCA TTTAAAGAGT GCGTAATAGC TCACTAGTCG AGTGACACTG CGCCGAAAAT 22560 

GTACCGGGGC TAAACATATT ACCGAAGCTG TGGATTGTCC TTTGGaCAAT GGtAGGAGAG 22620 

CGTTCTAAGG GCGTTGAAGC ATGATCGTAA GGACATGTGG AGCGCTTAGA AGTGAGAATG 22680

CCGGTGTGAG TAGCGAAAGA CGGGTGAGAA TCCCGTCCAC CGATTGACTA AGGTTTCCAG 22740 

AGGAAGGCTC GTCCGCTCTG GGTTAGTCGG GTCCTAAGCT GAGGCCGACA GcGTAGGCGA 22800 

TGGAJAACAG GTTGATATTC CTGTACCACC TATAATCGTT TTAATCGATG GGGGGACGCA 22860 

tAGOATAGGC GAAgcGTGcG ATTGGATTGC ACGTCTAAGC AGTAAGGCTG AGTATTAGGC 22920 

AAATCCGGTA CTCGTTAAGG CTGAGCTGTG ATGGGGAGAA GACATTGTGT CTTCGAGTCG 22980 

TTGATTTCAC ACTGCCGAGA AAAGCCTCTA GATAGAAAAT AGGTGCCCGT ACCGCAAACC 23040

GACACAGGTA GTCAAGATGA GAATTCTAAG GTGAGCGAGC GAACTCTCGT TAAGGAACTC 23100 

GGCAAAATGA CCCCGTAACT TCGGGAGAAG GGGTGCTCTT TAGGGTTAAC GCCCAGAAGA 23160 

GCCGCAGTGA ATAGGCCCAA GCGACTGTTT ATCAAAAACA CAGGTCTCTG CTAAACCGTA 23220

AGGTGATGTA TagGGcTGAC GCCTGCCCGG TGCTGGAAGG TTAAGAGGAG TGGTTAGcTT 23280 

CTGCGAAgCT ACGAATCGAA GCCCCAGTAA ACGGCGGCCG TAACTATAAC GGTCCTAAGG 23340 
TGTCTCAACG AGAGACTCGG TGAAATCATA GTACCTGTGA AGATGCAGGT TACCCGCGAC 23460

AGGACGGAAA GACCCCGTGG AGCTTTACTG TAGCCTGATA TTGAAATTCG GCACAGCTTG 23520

TACAGGATAG GTAGGAGCCT TTGAAACGTG AGCGCTAGCT TACGTGGAGG CGCTGGTGGG 23580

ATACTACCCT AGCTGTGTTG GCTTTCTAAC CCGCACCACT TATCGTGGTG GGAGACAGTG 23640

TCAGGCGGGC AGTTTGACTG GGGCGGTCGC CTCCTAAAAG GTAACGGAGG CGCTCAAAGG 23700

TTCCCTCAGA ATGGTTGGAA ATCATTCATA GAGTGTAAAG GCATAAGGGA GCTTGACTGC 23760

GAGACCTACA AGTCGAGCAG GGTCGAAAGA CGGACTTAGT GATCCGGTGG TTCCGCATGG 23820

AAGGGCCATC GCTCAACGGA TAAAAGCTAC CCCGGGGATA ACAGGCTTAT CTCCCCCAAG 23880

AGTTCACATC GACGGGGAGG TTTGGCACCT CGATGTCGGC TCATCGCATC CTGGGGCTGT 23940

AGTCGGTCCC AAGGGTTGGg CTGTTCGCCC ATTAAAGCGG TACGCGAGCT GGGTTCAGAA 24000

CGTCGTGAGA CAGTTOGGTC CCTATCCGTC GTGGGCGTAG GAAATTTGAG AGGAGCTGTC 24060

CTTAGTACGA GAGGACCGGG ATGGACATAC CTCTGGTGTA CCAGTTGTCG TGCCAACGGC 24120

ATAGCTGGGT AGCTATGTGT GGACGGGATA AGTGCTGAAA GCATCTAAGC ATGAAGCCCC 24180

CCTCAAGATG AGATTTCCCA ACTTCGGTTA TAAGATCCCT CAAAGATGAT GAGGTTAATA 24240

GGTTCGAGGT GGAAGCATGG TGACATGTGG AGCTGACGAA TACTAATCGA TCGAAGACTT 24300

AATCAAAATA AATGTTTTGC GAAGCAAAAT CACTTTTACT TACTATCTAG TTTTGAATGT 24360

ATAAATTACA TTCATATGTC TGGTGACTAT AGCAAGGAGG TCACACCTGT TCCCATGCCG 24420

AACACAGAAG TTAAGCTCCT TAGCGTCGAT GGTAGTcGAA CTTACGTTCC GCTAGAGTAG 24480

AACGTTGCCA GGCAAAAAAT GGATGCGATG AGCCGCATTG AGACCGCAAG GTCTCTTTTT 24540

TTTATGTCTA AAACGTCAAA ATAAAAAGCA AACACAAAGA AAAATGGCTT GGCGAAGTGA 24600

AAACDTTTGA ATCTGACGAA ACGAGAAAAG ArCGCAACGA GTTTAGTAGA GCTAAATGAG 24660

TAAGyGAGAG CCGAAGrAGA GGAAAGAAGC AAGCGATTGT CACAAGTCAA GAAAGGTTCT 24720

TAGCGASGAT GGTAGCCAAC TTACGTTCCG CTAGAGTAGA ACTGGAAATG ATAATTTAAT 24780

AATGTACACT TTCGATTGTC TAAGTATGTA CAACTTTAAT TTTGTGTTTA TATAAATTTA 24840

AAATGATATC ATCGAAAACA AAATATTGTA TAAATAGAGA AGAGCAGTAA GACGGTATCT 24900

AATTGAAAAT GATCTTACTG CTCTTTTATA TACTTTATTG AAATACAAAA AGGAAATTAA 24960

TTATTATACA ATAGACAAGC TATTGCATAA GTAACACTAA CTTTTATCAA AGAAGTGTTA 25020

CTTTATAATT AATGATTTTA TTAGAGCGTC TACATGCGGT TTTAAAGCAT CATCGTCTAT 25080

ACCGCCAAAG CCTAATATAA ATTTAGGGGT TTTCTTATAG TCTTGATCAT CATCAAAATT 25140

TCCATTTTTT ACTGTAATTG TAAAATGCAT ACCCGTTTCA GCACCTTGAA TATCAAGCTG 25260

CTCTTTGTAA GGTTTCAATC TTTTTAAAAT ATAGGTTAGT TTTCTACGAT AAATTCGTCT 25320

CATTTTATTT AAATGCCTTT CAAAACCACC GGAAGATATA AACGTTGCAA TAAGGTTTTG 25380

CATATGAACA GGTACAGTGT TGCCTTCAAT GTGATTTTGA GAATGATATT TTTTCATTAT 25440

AGAATAGGGT AACACCATAT ATGCAACTCG ACAGCTAGGA AAAATAGACT TTGAAAATGT 25500

ACTGATATAA ATCACTTTTT CTCCTCTTGA ATATAGACCT TGAATTGCTG GAATGGGTTT 25560

GCCGAAATAT CTAAACTCGG AATCATAATC ATCTTCTATA ATAAATCGTT CTTCTTTTTC 25620

TTGAGCCCAT TGTATTAATT GAGTTCGTTT TTTTAAGTCC ATCACATATC CAGTTGGAAA 25680

TTGATGGGAA GGCGTTATAT ATACTATATT TTTTTGTGAT TTAATAACTT CATCTACGTT 25740

TATTCCATTA TCTTCAACTT CAATTTGTTC ATATTCAACT TGTTTTTTAT CTAAAATATT 25800

TTTGATTGGT GGATAACTAG GTTTTTCGAT AATAAATGTT GAAGTATAAA GTAAATCGAC 25860

TAATTGATTT ACTAATTGTT CGGTAGATGA GCCAATTATA ATTTGATTAG GATCACAAAT 25920

TACGCCACGA TTAGTAAATA AATAAAATGC CAGTTGAAAC CGCAAATGTA ATTCTCCTTG 25980

AAAATGTCCT CTACGTAATT GATTTAAATG ATTTGTATCA TAAAGATCTT TGGAATACTT 26040

TCTGAAAAGT TCTATAGGGA AATGTTTCGT ATCTATTTCA TCCAAATTAA AAGCATAATC 26100

ATAAGCTTCA TCACTCGCTT TTGGTTTATA TGAATCATCA TCAAAAAGAG AGGGGATAGG 26160

TTGATTGTTT AAAATTGTTA AAGATTCAAT TTCGGACACA AAATATCCAG AGCGAGGTCT 26220

TGAATAAATG TAACCTTCGT CTAATAGAAG TTGATATGCA TGCTCTACGG TTGTTTGGCT 26280

AATAGATAAA TGTTTGCTTA ATTGTCTTTT AGAATAAAAT TTATCGCCTT CTTTAAATTG 26340

ACCTTCAATT ATTTGTTTTT TTAATTTTTC ATAAAGTTGA TGGTATAAAG TGTTTTTCAA 26400

XTTTATAACT GACCTCCTAA ATTTATCTTA TTTTGTACCT TTTTAAATAT CAGTTTATAC 26460

ATTACAATGT ATTTAATCAA CTTGAAAAGG GGTTTTATGT ATAATGAGTA AAATTATTGG 26520

ATCAGACAGA GTCAAAAGAG GTATGGCTGA AATGCAAAAA GGCGGCGTTA TTATGGATGT 26580

CGTTAATGCT GAGCAAGCAA GAATTGCAGA AGAAGCTGGC GCGGTAgCAG TTATGGCATT 26640

AGAACGAGTA CCTTCTGATA TTAGAGCTGC TGGTGGTGTT GCACGTATGG CAAACCCTAA 26700

AATTGTAGAA GAAGTAATGA ATGCTGTTTC TATTCCAGTC ATGGCTAAAG CACGTATTGG 26760

TCATATCACT GAAGCAAGAG TATTAGAGGC GATGGGTGTT GACTATATTG ATGAATCAGA 26820

AGTGTTAACA CCAGCAGATG AGGAATATCA CTTAAGAAAA GATCAATTTA CAGTACCATT 26880

TGTATGTGGA TGTCGTAATT TAGGTGAAgm TGCGCGTAGA ATTGGTGAAG GTGCTGCTAT 26940
ACAAGTTAAT TCAGAAGTTA GTCGATTGAC TGTAATGAAT GATGATGAGA TTATGACTTT 27060

TGCGAAAGAT ATCGGTGCGC CTTATGAAAT TTTAAAACAA ATTAAAGACA ATGGTCGTTT 27120

ACCGGTAGTT AACTTTGCAG CTGGTGGCGT TGCGACTCCT CAAGATGCTG CTTTAATGAT 27180

GGAATTAGGT GCTGACGGTG TATTCGTTGG ATCAGGTATT TTTAAATCAG AAGATCCAGA 27240

AAAATTTGCT AAAGCAATTG TTCAAGCAAC AACACATTAC CAAGACTATG AACTAATTGG 27300

AAGATTAGCA AGTGAACTTG GCACTGCTAT GAAAGGTTTA GATATCAATC AATTATCATT 27360

AGAAGAACGT ATGCAAGAGC GTGGTTGGTA AGATATGAAA ATAGGTGTAT TAGCATTACA 27420

AGGTGCAGTA CGTGAACATA TTAGACATAT TGAATTAAGT GGTCATGAAG GTATTGCAGT 27480

TAAAAAAGTT GAACAATTAG AAGAAATCGA GGGCTTAATA TTACCTGGTG GCGAGTCTAC 27540

AACGTTACGT CGATTAATGA ATTTATATGG ATTTAAAGAG GCTTTACAAA ATTCAACTTT 27600

ACCTATGTTT GGTACATGCG CAGGATTAAT AGTTCTAGCG CAAGATATAG TTGGTGAAGA 27660

AGGATACCTT AACAAGTTGA ATATTACTGT ACAACGAAAC TCATTCGGTA GACAAGTTGA 27720

CAGCTTTGAA ACAGAATTAG ATATTAAAGG TATCGCTACA GATATTGAAG GTGTCTTTAT 27780

AAGAGCCCCA CATATTGAAA AAGTAGGTCA AGGCGTAGAT ATCCTATGTA AGGTTAATGA 27840

GAAAATTGTA GCTGTTCAGC AAGGTAAATA TTTAGGCGTA TCATTCCATC CTGAATTAAC 27900

AGATGACTAT AGAGTAACTG ATTACTTTAT TAATCATATT GTAAAaAAAG CATAGCTTAA 27960

TGTATGCTAA ATCAACGAAT TATTGATATT TATAGATTTG TTGAGAAGAA AATATCTCCT 28020

TCAAACTTAG CTTTGGAGGA GTTATTTTTT ATGTCAAAAT TAAAAATGAT AAAAAATAAA 28080

GCTATACATA AGAAAAAAAC CCTTCAAAGA GACTGAGAAT AGTCAAAATT TTGAAGGGGT 28140

TAATTCGATG TTGATGTATT TGTTAAATAA AGAATCcAGC GATTGCAGCT GAAATGAAAG 28200

ATACTAGTGT tGCACCGAAT AATAATTTCA AACCAAAGCG GGCAACTGTA TCTCCTTTTT 28260

TGTCATTAAG TGATTTAATC GCACCTGAAA TAATACCGAT AGAGCTAAAG TTAGCAAATG 28320

ATACTAAGAA TACAGATGTA ACACCTTTTG CGTGTTCAGA TAAATCACTA AGTTTACCAA 28380

GTGCTTGCAT TGCTACAAAT TCGTTAGATA ATAGTTTTGT CGCCATAACT GAACCGGCTT 28440

GAACTGCATC TTGCCATGGC ACACCGACTA AGAATGCAAA TGGTGCAAAG ACAAAACCAA 28500

TTAATGTTTG GAAATCCCAA GAAATAGCGC CACCTGAAAC TGTACTAAAG ATATTGCTTA 28560

CAATTCCATT TAATAGAGCG ATAATGGCAA TGTATCCGAT TAACATTGCG CCTACAATGA 28620

CAGCTACTTT AAATCCATCT AAAATATATT CTCCTAGCAT TTCGAAGAAT GATTGTTGTC 28680

TTTCTTCAGT TTCTTCAACT AATAATTTGT CATCTTCTTC ATTAACTTTA TAAGGGTTAA 28740
TAGGTTCAAT TAAGGTAAAG TATGCACCGA TAATTGAAGC AGAAACAGTC GACATTGCTG 28860

AAGCTGTTAA TGTGTATAAA CGTTGCTTAG GTATGTATGG TAATTGTTTT TTAATTGAAA 28920

TAAATACTTC AGATTGTCCC AAAATTGCTG CAGCAACTGC ATTGTATGAT TCTAAACGTC 28980

CCATACCATT AATTTTAGAA ATTAAGAATC CTAAAACATT AATGATTAAA GGTAAAATCT 29040

TTGTGTATTG AAGGATACCG ATAATCGCTG AAATAAATAC GATAGGTAAT AATACACTGA 29100

AGAAGAATGG TGGTTGCTTA GGATCGATAT ATTGAATACC ACCGAATACA AAGTTAACAC 29160

CATCTGCTGC TTTTAATAAT AAGTAGTTAA AACCGTTTGA AATACCACCA ATAACCTTGA 29220

TTCCCATTGT AGTTTTAAGC AAGATAAATG CAAAGATAAG CTGAATTGCA AGTAAAATTC 29280

CTACATATTT CCAGCGAATA TTTTTCCTGT CTGAGCTAAA TAGAAACGCA AGTGCTAAAA 29340

AGAAGATAAT TCCGATAATC CCAATTAGAA TATGCATATA TTTCTCATTC CTTTAGTTTT 29400

TTCTACaATc TATCATACAA TAAAATGGAA GGGCTAACAT CATAAATTTT TGAAAATATA 29460

AAAACAAATT AATTGAAAAA GGTCAAAATA GGTCATATAA TATAGTCAAA GAAGGTCAAA 29520

AAGGGGTGAT ATACATGCAC AATATGTCTG ACATCATAGA ACAATAaTCA AACGTTTATT 29580

TGAAGAGTCG AATGAAGATG TCGTTGAAAT TCAGAGAGCG AATATCGCAC AGCGTTTTGA 29640

TTGCGTACCA TCACAATTAA ATTATGTAAT CAAAACACGA TTCACTAATG AACATGGTTA 29700

TGAAATCGAA AGTAAACGTG GTGGTGGTGG TTACATCCGA ATCACTAAAA TTGAAAATAA 29760

AGATGCAACA GGTTATATTA ATCATTTGCT TCAGCTGATT GGACCTTCTA TTTCTCAACA 29820

ACAAGCTTAT TATATTATTG ATGGGCTTTT AGATAAAATG TTAATAAATG AACGTGAAGC 29880

TAAAATGATT CAAGCAGTTA TTGATAGAGA AACGCTATCA ATGGATATGG TTTCTAGAGA 29940

TATTATTAGA GCAAATATTT TAAAACGTTT GTTACCAGTT ATAAATTATT ACTAAATGAA 30000

ATGAGGTGTT GAAGTGCTTT GTGAAAATTG TCAACTTAAT GAAGCGGAAT TAAAAGTTAA 30060

AGTTACAAGT AAAAATAAAA CAGAAGAAAA AATGGTGTGT CAAACTTGTG CTGAGGGGCA 30120

CCATCCGTGG AATCAAGCTA ATGAACAACC TGAaTATCAA GAACATCAAG ATAATTTCGA 30180

AGAAGCATTT GTTGTTAAGC AAATTTTACA ACATTTAGCT ACGAAACATG GAATTAATTT 30240

TCAAGA 30246

(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14333 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

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SEQUENCE DESCRIPTION: 


SEQ ID NO: 


57 : 








TATTCCCCCA TCGGTTTATT aaatcgtcca TTTCAATACT GTTTTTCCCC AAGATGTCGA 60

TAAATCCATT TCAAACGCTT ggacgatatc TTGCATCGTA catacattaa TTTCATGTCC 120

TTTTAATAAT GCTAACTTTT caactatgtc TGGGTACTTA cgatataaat caacaacttg 180

CTCAAAATCT ttagagccgc ttcgactact ACCAATCAAC gttaatcctt tttcaagtac 240

taatCGTGTA ttcacttcca cgggtaattc ACTTACGCCT AACAAAGCAA TACTGCCTTC 300




TC<^Tr;AAAT& TGTTCAACTA tttgttgaag TGCAACTTGA cttcctttac CTCCAACACA 360

TTPAAATHPA tgatcaattt taagatcatc TGGTATTTGA TTTACTGTAA AGATGTCATC 420

           aaatgactta atttatagtc TGTCTI'ACCA aatacataag TTTTAGCTTC 480

           ttacgtagca aaatagcagt aatataacct aagttapcat CACCCCAAAT 540

           gttttcaaag gtatagattt acgttcaaat cgttgtatag catgataact 600

           aactctgtgt atgaaatcgt actcaaatca atgtcattag GCAGCGGAAC 660

n A T 1 A 07 A TP A tgtgccatca caacgtagtc ttgcataaaa ccatcataac cactagatct 720

           gaggctaagt aattctccgc aataatatga tgttgctctg taggtgtatt 780

           actactttcg tacctttttc aaatacccct TTACTATCAA atacaacttc 840

           tcatgaacta atgacattgg taattttttg cgtagtacat tttcatctct 900

X VtaUAVvV» lulu taataccttt gatcagctgc acaaatagac aagtataaag gtcttacgat 960




GACATGATTA ccataaatat caacattatt atatgtgacg TCGAACTGTC tcggtgcaac 1020

GAGTTGATAT acttgattaa tcatcggcaa tatcaccttg AATAATGGCA tttgctactt 1080

ttaaatcata cggtgttgtc actttaatgt tgtatagttc TCCaCGTACC AATTTAACTG 1140

CAT5TCCAGA TTCGACAATG attttacatg catctgataa gatttctttt tgttcactac 1200

TTAAGGCGCG ATAACTATCT tgtaataatt taatattaaa tgattgtggt L»l u i-i\iGCCTT 1260

GATACATTTC attccttaca gggatactgt gtatgttctg TTTATCTTTA GACATTACAA 1320

TCGTATCAAT tgcttcaatg ACTGTATCTA CTGCACCATA TTTTUCTGCT acttcaatgt 1380

TCTCTTTAAT aatacgttga GTTAAAAATG GTCTTACGGC atcatgagtt ACAATCACAT 1440

CATCATTATT AATTCCATTT ACATTGCGAA tatggtcgat aatgttcata ATTGTTTCAT 1500

TTCGATCCGT accacctgca ACTACTTTGA cacgttgatc tgtaatgtta tattttttta 1560

AAATATCCTG tgtatgggaa atccactgtg CTGGCGTTGC gataataatc TCATTAAATT 1620

CACTCACTAA aatgaacttc TCAATTGTAT ggattaaaat cggtttatta TCAATATCTA 1680
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CTGCATAAAT CATGTTGTCC TCCATTCTGT CATTACATCA TTTCCATTTA TACATTACTG 1800 

ACCTATGCCC GCACATAAGC CTAACCTATT GCTCACTTGC CTCTTTTATT AATCCAAAGA 1860 

TAGTTGTCAC AATAGTGTGA TAATTTTTTA TAAAAATGTA TTTTTGTAAC TGACCATTCT 1920 

AAGTTGTTTT GCCATGCAGT TAATCATTAA CTCTGACGAT ATTAAATTGT TAAAGGTATT 1980 

AATGTTTACT CTTTTTCAAA TTCATTATTA CTGCCATCAT TTTACCATAT ATTATAATAA 2040 

ATTTATCTTA TTAAGTGGCT GTACTTGATT TTCACTTTAA AAATTATCAA ATATTGCCAT 2100 

CTCATTTTAA GTATACAAAA TGCAAAACAA CCGATTCACA AGCATATTTC ACACAAGTAA 2160 

ACCGGCTATT TATCAACGTA TATTCGAAGA TGAATTATTT CGATAGTATC TATAGACCAG 2220 

ACGGCATTCG CACTTTCATA GCTATAACTA TACCAGCGTT TTCGTCCTCA AAGGTGCATA 2280 

CTAATAAATC GTAAACATGA CTTTATCAAA TCGTTCTTTC TTGTTAACTA ATTTATCAAA 2340 

TGTCTCCGGG CCTTTTTCTA ACGGTAAAAA ATGAGAAATA ATAGGCTTTA CATTAATATC 2400 

TTTCGTCTTC ATATAATGTA AGGTTGCCGT CCACTCTTTG CCCGGAAAAT TACTGGACAA 2460 

ACAGTTCCAA GAGCCACATA CTGTCAACTC GTTACGCAGA ATTTTTTCAA AATGAACGCG 2520 

ATCAATCTCA ATATCATCAT ATGGTATTCC GAGTAATACC ACCTGGCCAC CTTTTTTAGG 2580 

TAGCGTCAAT ATTTGACCAA TCGTAACTTT AGCACCTGAT GATTCTATAG CTAAATCGAT 2640 

TTGATTGGCG TAATGATTTT CGATGAATTT CTCAAGATTT TCTTCTTTTG AATTGATTGT 2700 

TTGATGTGCG CCCAATGATG TTGCAATATC TAGTTTATGC GCATCTATAT CTATAGCGAT 2760 

GATATGTGCA GCACCAAATA TTCGTGCCCA TTGAATAGCT AACAAACCTA TACTGCCACA 2820 

CCCCATTACT GCAACAGTCA TACCAGGTTG TATATTCGAT TTATAAAACC CATGCGCAAC 2880 

AACGGCTGAT GGCTCAACCA TTGCTGCTTC AATGTAATCA ACATTGTCTG GAACCTTTAA 2940 

AACATTTTGC GCTGGCAATT TGACATATTC CGCGAACGAT CCAGGTTCAT ATGAGCCAAT 3000 

GACGAATAAC TTTTCACATC GTGCATATTC ACCTTTTAAA CAATACTCGC ATTGATAACA 3060 

AGGTATTGCT GGGCAACCTG TCACTTTGTC GCCCACATTA ACATGCGTAA CATCACTTCC 3120 

AATGGCATCT ACTACACCTG AAAATTCATG ACCAAATGGC ATACCTTTAA TGTATGGCCC 3180 

CATTTTTTTG TATCGTGACG TGTCTGAACC ACATATGCCA GTCGCTCGTA CTTTAATAAT 3240 

AACGTCATTC GCACTTTCAA TGACTGGCTT TTCATTATCC TCATACCGTA AATCTTCCAC 3300 

GCCATATAAT TTCAATGCTT TCACTTGTAA ATCACCTCAA ATTTGATTTA ATTCACAACT 3360 

T TTT TLTTTT TAAAAATACC TGTCGCAAAA TAACCTGCAA TGACAATGGA ATTACTTACG 3420 

AGTAAATGTT CCATATAAAA ATCAGTGATT TGTCTTAATG GCCCAAGCAT AAAAGTTAGC 3480 
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TGCTTTAATA 


> CCTTCGCCGG 


ATTTTA A aTYI 
•"A X X InnnlVJ 


TTP. a T & p^ 


TCGTCCCATT 


TCGAAATATC 


3600 




ATATATTTTT 


GTCACCAAAG 




T A f*T a. A & P p a 




1TGCAATGA 


3660 


5 


AGGTTCCCAA 


TCTGCTGGCT 


TTTG a nr CT 


ArTArrAara 


AL"1\iT1A 1 1 1 


C 1 l 1 1 1 vSAAT 


3720 




CACTTTTTCC 


ATATCAAATG 


GAATTTCAG^ 




A1ALL 1AI 1 1 


p r ptpt r ^ r r 
UAL, I Lj i AUAA 


3780 


10 


ACCTTTTTTG 


CGTAAAATAT 


CCAAACCTTG 


TPGTP.PTP.PT 
X i uL X v»v- 1 


P P R R PTV^ P R P 


PT*P R R P R « r%»ii/>*i 

L1GAACATTC 


3840 


AACAACAACA 


TCTGCACCGT 




ft a ■ i ™ i * p* p* r *t , *t*/^ 

AMI 1 LLnl 1 L» 


ATATACGTTT 


TT AAGT CTGT 


3900 




TTGTTGTAAA 


TTGACTACAT 


AATP fATTITn 

Art X UWli OlU 


I — r\t\ 1\jL 1 1L1 


bL i i lAlLlA 


ATCTGACTTT 


3960 


IS 


GTCATTGTCC 


AATCCAGTTA 




1 oL.Lj\_L, 1 x 1A 


CTTTTTAACA 


CTTGTGCTAC 


4020 




AAGTAATCCG 


ATTGGCCCAG 


GrO^f* ATT A P 


nnL 1 vjL 1ALA 


ltAjCCTGAAT 


TGACTTGAAT 


4080 




CTTAGAAACG 


CCATGATGTG 


PArATftPTA a 


A X v- ivj 1 \» 


a ^^rpp^p^^ ji ^ 


ACTGATACGA 


4140 


20 


TAt TCGTCTG 


GAATATGATG 




•pp a pptppi a 


T*P RPR R n 

1 bALATAAn 


AGTAAATGCG 


4200 




CCATCAACTT 




rtv- V_ X X X X V- W\ 


Try ttp raf a 


AA1TATAGTC 


TTTTGATTTA 


4260 




CAGTATTCAC 


ACTCATTACA 


AAPATAGi AT 


w X V-.V7 111 LnVj 


alti C GACACG 


4 M 1 ft ^*^^^ft ft j*'^^^ 

GTCACCAACT 


4320. 


25 


TTAAAATCTT 


TAACGTCTGC 


TCTAArTTPA 


aprtaiTTpftp 
alua ixi lal 


P R P R R R R mpi^ 

LAUAAAA l i C 


ft ft ^^^*t#V% ft ft 

ATGACCTAAT 


43 80 — 




GTCACTGGAA 


AATTAArTTT 


ATA ATT5APPT 


TPaTaaPTaT 
1 l-A iAAu lAi 


uAA lATCTGT 


^+ m ^^ft ft ft mm 

GCCACAAATT 


4440 - 




CCTGCATAAT 


GTACTTTAAT 


PTTTAPTTTA 
V» X X X iiV. X X X A 


TPSTPT&PPP 
x v_A 1 L. 1 AvjV. W» 


PTPTWnPPK m j^t 

Li 1\» 1 xvJCAAC 


TTCTTTATCA 


4500 ^ 


30 


AGAAGTTCTA 


AGTTGCCATG 




P^* i"i M Pi w r R /'*• !• R 
Villi 1AL 1A 


aaw. rncAC 


CACAAACACC 


4560 ^ 




TCGATTTITA 


ATTGAATAGA 


(fyp AAAT AGTT 
x x n\j x x 


TA A AP.ATA aP 


ATAGTTAACG 


ATATTACCAC 


4 620*- 


35 


CTTGATCAAT 


ACTTGAAATT 


T CAGATGAAP 


pTTTrnfirAT 

WX X X X OUWA X 


TTGTACATTC 


GTACCTTTCG 


4680 


CCATATCTGT 


GAAAATGGGT 


CZC^T* A f*V*STf 
w» x nv.u i^iu 


1 1 uLAA 1 A 1 A 


TAGTGAAATT 


GCAATCATAA 


4740- - 




TCGTACCCAC 


AATGACAGAA 


TGAATAATTiT 


TTPPTPTTP.P 
X X V-V. 1L1 XV9%» 


TGCACCAACA ATAAACGCGA 


4800 


40 


CAACAAATGG 


TATCGTTGCT 


AAGTCAPCAA 


AAfy/TArVTAP 


TTGGTTTCCT 


GGTAAAATAA 


4860 




CGGCTAATAA 


AACAGTGATA 


GGTACTAAAA 


TT AATP*rTf"IT 

X X nA X wx« X VJ X 


CGAAATAACT 


GCTGGATGAC 


4 920 




CTAATGCTAC 


AGCCGCATCC 


AATCCAATAT 


AAATTTPAPn 


TTCGCCAAAA 


CGTTTATTTA 


4 980 


45 


GCCATGTTCT 


TGCAGACTCT 

* ^»^»^#*ft^» a ^» A 


GAAACTGGCJV 


TT AAA PPTT P 

X X A/VlVvW X Xw 


CATTAAGATT 


TTTACCATTC 


5040 




TAGGCATTAA 


TACCATTACT 


GCAGCCATTG 


ACATTCCTAA 


ATTAATGATG 


TCTCCAGGTT 


S100 




TGTAACCTGC 


TAACACACCA 


ATACCTAAAC 


CTAAAATTAA 


GCCGACAAAT 


ATAGACTCTC 


5160 


SO 


CAAATGCGCC 


AAAACGTTTT 


TGAATTGTTT 


CAGGATCAGC 


ATCTAACTTA TTCAGACCGG 


5220 




GTACTTTTTG 


TAACAATTTA 


ACTAAGTAAA 


TACCTGGTGC 


ATAAGAAATT 


GTACTTCCTG 


5280 
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CTACTTTCAA ACAGATAATT TGGAAAATAA CTGCTGCTAA TAACGCTTGC CAAATACTGC 5400




CTGATACGGC ATAAACCATT GCTGCTGTAA ACGTATAATG CCAAAAATTC CAAATATCTA 5460

CATTCATCGT CTTTGTCACT TTAGTTACTA GCAATACAAC GTTAACTATG ATTCCGAGTG 5520

GAATAATAAA TGCTGCGACA GATGATGCCC AAGCGATAGA TGATGTTGCT GGCCAACCTA 5580

CATCAATCAC ATTCAGACTG ACGCCTAAAT TTTTAACCAT CGCTTGTGCT GCTGGCCCTA 5640

AATTTTTAAC TAATAAATCG ATGACTAAGA AAATCCCTAC AAAAGCCACA CCTATTGTTA 5700

AACCAGACCT AAATGCCGCT CCAATTTTCT GCCTAAAGAA TAGGCCAAGC AAGAATATGA 5760

CAACCGGTAA AATAACAGTt GCACCTAAAT CTAAAAATCC CCTTACAAAA TCAGTGAAGT 5820

AACTCATATT TAAACCCTCC CTGTTATATA TGCATTGTCA CGATACTTTC CGATTGTGAT 5880

TACATTTGAC GTTACAGTCA TTTCAACGAC AACCCTTGCT AAATTCGACT GCAGTCCTTT 5940

TGAATTACAG tCACTGCGTT TCTATGTCAT CAACAATCAT TTGTCGTGAT AGTCATTTAT 6000

ATGCAATTTG CATATATTAA TATGTTATCG ACCCACGTTA CATATCAATT CCGTTATTTT 6060

TGTAACTCTG TTAAGATTTG TTGTTTTGTT TCTTCAATAC CAATACCAGT TAAGAAATTA 6120

CGTGCGTTGA TAACTGGGAA TTTATATTCT IHTriTTUTCA TTGCAGTTGT AACTAATAAA 6180

TCTGCAGTGT CTTCATAAGG TCCAACTTCT GTAATTTTGA TTTGTTTAAT ATCTACTTTA 6240

ATATTGTGTT CCTTTGCCAT TTCTTCAATT GCATTATTTA CTACTGTTGA CGTTGCAATA 6300

CCTGCACCAC ACGCTACTAA TACTTGTTTC ATTTTCAATT CCTCCAATTA ATTTTTAGTT 6360

ATATTCCAAA TAATCATTGA TTAGTGTTGC TAAAATTGTT TCATCTTTCG TTCGTAGAAT 6420

CTGCTCCAAT TTTTCTTCAC TTTGAAAAAT TTGCATCAAC TGTTGTAACA GCTTAAGTTG 6480

ATCATCTACT TTATCCATTG CTAACATAAA AACGATTTTC ACTTCTGTCT GTTGATCAAG 6540

TGTTCCCATT TCAATAAACG GCACTTCTTT TTCTAGAACA GCCACACCTA TCGTTCTATG 6600

GTTAATATGT TCGACATCTG TATGCGGTAT AGCGACCGAA CATAGATGCG TTGGTAAACC 6660

AGTAGCAAAgi TCTTTTTCTC TGTCGATGAC TGCATCTTTA AACGTTGACT TCACGAACCC 6720

ATTTTGAAAT AACACATCTG ACATTTGTGA CAATACGGAT TCTTTATCAG TTGCCGACAA 6780

ATTGAGCATT ATATTTTCTT TATGCACTAA TTGCTGTCCC ATCCATTTTC CCTCGCTTCT 6840

TTATTTGAAT AATTTTTTAA AATCTCATTT ACATCAGAAT TTTTGCGACT TTGTATGATG 6900

CGCTTAATTG CGTCATTGTC TTGCGCCACA TCTCTCAATT GTAGTAACGC TCTTAAGTGT 6960

GTCACTTTAT CAACAGCAGC AATAGGTACA ATAATATGGA TTGCTGTGCC ATCTGACATG 7020

TATATTGGTT CTTGTAATAT CAACATACTC ATCGCTGTTT TATGTACATG CTTTTCAGAG 7080
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TGCATCTCAT GAATATATTT AATATCAATA AAATGATTAG CAACTAACAC ATCACTTGCT 7200


TTAGCAATAG CTTCATCAAT ATTTTCAACA TGATGCATTC TTTTCACGTG CCTTGCCGGT 7260

ATCAAGTCAG CTAAATCTAA TGyCTwATTT tGTGtGACaA TCGATCCATT AATGGTTGAA 7320

ATTGAATTAT AATTGGCAAT AAAATCTTCT AAACCATCAC GTAGTcTGTA ATGTCATTAA 7380

CTGTCGTTGT GCGTTCAATT AATGCCATTA ACTTGTTTAT TTCCTTATCA ATGTCAGCCG 7440

ATTCCTTATT AATGTACTTC ATCACTTCTT TACGTAACTT TCGTTGCTCA TTTTGAGATA 7500

AAGCTACTTT TGTGATAAAT AATTTTTTAT GTGTTAGGAC AAACATTGGT GAAAAGACGA 7560

TGTCATAATC TAATGTGTAA TTTTCAAATG TTCTAAGTGA AATCGCATCT AAGAAAATAA 7620

TTTCTGGAAA TAAGTTTCGC AACTCGTATA ACATCATTTG TGATACTGAC GTGCCTTGTG 7680

TACACACGAT AATAGCTTTT ATCTTGCCAT CGAAGTTTTC ATCTTGACGT CTCAAACTAC 7740

CTCCGAACAA CATGGTTAAA TATGCTATTT CATTATCAGG CAACGATTTT CCGAAATATT 7800

CAGTTAACGA TTGACATGAT TGTTTCACCA TATGAAATAA GGATTGATAA TTTCCTTGTA 7860

AAGGATTTAT TAATTCATCA CGATCCGTTA AGTTATATTT AATCCTATAA AAAGCAGGCG 7920

TTAAATGTAA CAAGAGTTGC TGTGATAATT TCTCCTTATC TTCAATGTTA ATAAAAGTGA 7980

'rrrcrrcAAA ATGGTGAATC ATTTGAGCGA TGGCCATCGT TAAATTCGAT ATGCTATCTG 8040

ATTCTTGCAA ATCAGTCCAT TGCACACTTG TTGAAAGTAA GTGTAATGTC AAATATAACT 8100

TTTCCGCTTC TGGCAAATCC GGCTCATGTT GCGTCATAAT CTCCGTTGCT TGATATTCTT 8160

TCGTATCCCT CAAATACTGA TAATTAATAT TTAATGGATT CATCACATGA CCACTTTGAA 8220

TTCGTCTACG AATCACACAA AGGACATAAG GCAATGAACT AAGTGATTTG TCTATAAAGC 8280

GACTCTTCAA AAATTGTTCT ACCTGTTTGA TCTTGTCTTT TTGATATGCG ATATCTTCGA 8340

ATCCfcAAGTT GAGCGCCTTT AAAACTTCAC TTTTAGTAAT ATCATGATTC AACCTTTGAT 8400

CAATCAACTT AATGAAGAAA CGGCGAACTT CAAATTCATC ACCAACAATT TCATAACCAT 8460

GTTTTCGAGA ATACTTAAGT GACAAACCAT GATTTTCCAA TTGCTCTTTC ACATGATTTA 8520

TATCGTGAAT GACAGTATTT TTACTGACTT GTAAATCAAT TGAAAAATGG TTTAGAGACA 8580

TTGCGTTTTC CTTACTAAAA AGCATGAGCA TTAAATAATA ACGACGTGTT TCTATGCTAA 8640

AAATGACATT GTTGCCGTTT AACATTTGCT GCTCCGATAC ATCTCGCTTG AATAACGTCA 8700

TGATTTCAGA ACTTACAATA AAATTTCCTT GGCTTGTTCT TTCAAGTTTT GGATAACCCT 8760

CTTGTTCAAG CCACAAATTG            TGCGATATCC TAGTTGTCTA CGAGACAAAC 8820

CAAATATCGA TTCAAGTTCT TTACCATGAA TAGTAGGATT CAATACAATT TCTCTGAGTA 8880
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w 



TCAATCGTCA CACCGATGTA CACACTTTGA ACACATATTT TCAAAATGAG CATGTACATC 9000

ATTGTGATGT TTTAACAACA TTTCAATTAT ATCTATATTT TTTGTGATTT TAATCTTTTA 9060

AAATAAAGCA ATTGAAATTT TTGCATATAT TTTTGTGTTT TGTGTTTTTT TGAAGCATTT 9120

TTAACATACA TATCTCAATC ATTATCAAAT TGTCATGACC ATTGTAACCC AATACAAAAA 9180

CCCTAAGGAC GCTTATATCA GGCGCCTTAG GGTTAACTGT ATCTATTTAA TTAAGTATTA 9240

TTATTCGTAT GTACGTAACT TATGGTCTAT CAAGTTCCAC ACTTCTTCAA CATCAACTGC 9300

TGTAGCAAAA TAAGCATTGG CAGGCTTACC TGTAACATGA TTTAAATCGA CAGCCATAGT 9360

GCCATAAGTT AGTGGACTTT GATGTTCAAT GTCGATATTA ACGGGTACCA TTGTAAACAA 9420

TTCTGGTTGT AACAAATACA AAATTGTACA AGCATCATGT ATTGGACCAC CATCCATATT 9480

AAAGTGAGTC TTGTATGTCT TCTTAAAGAA TTGCAATAAT TCTACGACGA ACTGTGCAAC 9540

AGGATTATTG ATACTTTCAA AGCGTTCAAT CACGTGATCG TCGGCTAAAA CTTGATGTGT 9600

TACATCTAAA CCAAACACAT TTATAGTAAT CCCACTTTCA AAAACACGCT TCGCTGCTTC 9660

AGCATCTACC CAAATATTGA ATTCTGCTGT AGGCGTCCAA TTTCCAAATG TACCACCACC 9720

CATCAAAGTA ATAGATTCAA TATGCTCAGC GATTCTTGGC TCACGAATCA ATGCCGTTGC 9780

TACATTCGTA AGAGGACCTG TCGCTACAAT TGTTACAGGT GTATCACTCG TCATCACTTT 9840

GTTTATAATC ACATCTGATG CTGGCATTGC AACTGCTTGA CGTGATGGTG TCGACGGTAG 9900

TTTCGGACCA TCTAATCCAG ATTCCCCATG TATTTCAGAA GCAAAGGCAG CTGGTTTAAT 9960

TAACGGCCTA TCCGCACCTT TCGCTACTGC TATATCTTGG CGTCCCATAA TATCCAATAC 10020

GTTCAAGGCG TTTGTCGTAT TCTTGTCAAC TGATTGATTA CCTGCGACTG TTGTTACAGC 10080

TAATATCTCT AGTGGACTGT CAATTGCCCC CGCTAAAATT AATGCTATTG CATCATCGTG 10140

TCCEDGATCA CAATCCATAA TAATCTTTCT TTTCATTTAT ATATCCACCT TTCTTAAGTT 10200

GTTATCGATA GCTTATGTAT ATTTATTTAT GTGGTGAATC ATGTTTATTT TGAAAAATAG 10260

TTTTAACTTT CTCATATTTT TGGATACAAA CACTATTTAT CTATTTTATG GCTTATAAAT 10320

TTATCCGATA TGCCTTATCA ACCTACCTCG CTAAAAATAG GATGTCTACA TATCTATACC 10380

GACTTTTGTC AACTCATTTT CACAACAATA TAAACAGCAA TTTATATGAT TGTTACATGA 10440

TTCAAACAAT TTTTATGAAA AATATTTTCA TACACAGAAT ATATATTGAT ATTAAATTTC 10500

TCAAAAGCTA TATTGAGAAT AATTAGGAGG GATGTTGATG AAATCTTTAT TTGAAAAAGC 10560

ACAGCAGTTC GGCAAGTCCT TTATGTTACC TATCGCAATC TTACCAGCTG CAGGTCTATT 10620

GTTGGGTATC GGTGGTGCAT TAAGTAATCC AAACACCGTT AAAGCATACC CTATTTTAGA 10680
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AAATTTAfTf: 




CAATTGGTGT 


CGCAATCGGA 


TTATCTAGAA 


GCGATAAAGG 


10800 




TACTGCAGGT 






CTTAATTATG 


AACGCAACTA 


TGAATGGCTT 


10860 


5 


ATTAACTATC 


ArYV5GPAr*AT 


IVjVjLAAAAvjA 


TCAGCTTGCA 


CAAAATGGAC 


AAGGCATGGT 


10920 




GCTCGGTATA 


v_rtAACVJVJ X X VJ 


AAAC CGGTG T 


TTTTGGCGGG 


ATT AT CACAG 


GTATTATGAC 


10980 


10 


CGCAATACTT 




A 1 LACAAAGT 


GG T ATT A C CA 


CCGTAT 1 x AG 


GTTTCTTTGG 


11040 


TfW*TCTAf2A 
1 VJ\JV» 1 V, 1 /\\J/\ 


1 1 xvj I LUCTA 


TTGTCACAGC 


ATTTGCCGCA 


ATCTTTTTAG 


GTGTATTGAT 


11100 




P.TTTTTraTT 
ui i 1 1 1 v_/\ 1 X 


IwL CAAG CA 


TACAAGCCGG 


CATTTATCAT 


GTTGGTGGAT 


TTGTAACGAA 


11160 


1S 




A X CGGTAC i i 


TTGTTTATGG 


CTTCATCTTA 


AGATTG TT AG 


GTCCACTCGG 


11220 




■ 1 w I w 1 ' A O * f^f^ R. ^P 


AX X 1 X~X I ACT 


T AC CGTTTTG 


GCAGACGGCA 


CTTGGTGGTA 


CTTTAGAAGT 


11280 






1 lnb 1 X CAAG 


GTACG CAGAA 


CATCTTCTTT 


GCTCAACTTG 


GTGATCCAGA 


11340 


20 


TfSTfl ft PT »»P 


T ATT ATT CAG 


GTGTGT CACG 


CTTTATGTCA 


GGCCGTTTTA 


TTACGATGAT 


11400 




^P*P f^/^/^J"** 1 P A 


TGTGGTG C GG 


CACTTGCAAT 


TTATCACACA 


GCTAAACCTG 


AACATAAAAA 


11460 




ALt 1 1 vj 1 CGGC 


GGTTTAATGT 


TATCCGCTGC 


ACTCACTTCA 


rrriTAACAG 


GTATTACCGA 


11520- 


£0 


AC C X J7I AG AG 


TTTAGTTTCT 


TGTTTGTCGC 


ACCTATTCTT 


TATGTAATCC 


ATGCCTTCTT 


11580 




iuAlVjbnl 1A 


GCATTTATGA 


TGGCAGACAT 


TTTCAACATT 


ACAATTGGTC 


AAACCTTCAG 


11640 




1 vj\_»AI»tOL- ill 


ATCGATTTCT 


TACTCTTTGG 


TGTGCTACAA 


GGTAATAGTA 


AAACAAACTA 


11700^ 


30 


<-V— 1 A 1 nLo i C 


A I AC CT ATTG 


GAATTGTGTG 


GTTCTGTTTG 


TATTACATCG 


TTTTCAGATT 


11760'' 




V— 1 X nnX X /\v_vj 


AAATTTAATT 


fp^l ft 1* ft ft ft /**/~^ 

TCaaaACACC 


TGGTCGAGAA 


GATAAAGCTG 


CAGCACAACA 


11820* 


35 




At- 1 GAAAGAG 


CACAAACTAT 


TGTTGCTGGT 


TTGGGAGGCA 


AAGATAACAT 


11880 


TO A A A TPHTT 




CAACGAGACT 


ACGCGTCACA 


CTTCATCAAA 


ATGACAAAGT 


11940 




AT AAA <TT A 


X X AC X CGAAA 


v»TACTGGTGC 


CAAAGGTGTA ATCCAGCAAG 


GCACTGGTGT 


12000 


40 


Hr*A AflTA ATT 


X AX GGGCC 1 C 


ACGTTACAGT 


TATCAAAAAT 


GAAATTGAAG 


AATTGCTCGG 


12060 




fVtATTAAftAr' 
wnl X AAvXAV. 


X AAC LuAAAT 


^p/^ ft ft /*» ft ft ft 

ATCAACAGAA 


CTAATGGCAA 


CGATGTACGA 


AGTAAGAAGT 


12120 






V. X X X 1 Al XXI 


X AA xVj X X ACA 


TTTGAAGCAT 


TAAGTTCATC 


ATGCACTGTA 


12180 


45 




AAATCGCCTC 


TGCTAGACAA 


TCATCTTAAT 


GCTATGATTA 


AAGCTTAAGT 


12240 




GCCAGATTTG 


AATTTAATTT 


CAACAACGAC 


TTTCACTACA 


TTAAAAATAG 


GGCCACTCGA 


12300 




CACATATAGT 


TGTATCAAAT 


AGCCCTTTAT 


ACAArrrrrr 


GGGTAAGGTT 


TTACAATTTT 


12360 


50 


TGGGATGGTA 


TAGATTTTAT AAAAAGTTAT 


TTAAGTTCTT 


CTGCTTCAGC 


CATAATATCT 


12420 




TTTAATGTTT 


TAGCTGAATG 


TGCGAACTTG 


CTTTGTTCTT 


CGTCGTTTAA 


TGGGATTTCT 


12480 
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TCCTCATATT CGCCTTCTAA TAATGCTGAT ACAGTCAATA CGGCATCTTC ATTTCTGAAA 12600 

ATCGCTTCAG TAATTCTAGC TAATCCCATT GCAACACCAT AATAAGTGGC ACCTTTAGCT 12660 

TGAATAATGT CATATGCTGC ATCACGTGTT TGAACAAAAA TTTGTTCAAT TTGCGCTTTG 12720 

CCCTCAGGAC GTTGTTCAAG TAATGTCTTC AAAGGTTGAC CCGCAATATT AGCGTGTGAC 12780 

CATACTGGTA ATTCAGTGTC ACCATGTTCA CCAATAATTT GAGCATCGAC GCTACGTGGC 12840

GCAACATCGn AcgyTcGCTT AACAATAATC TAAAGCGTGC AGAGTCTAAA ATTGTACCAG 12900

AACCTATAAC ACGTTCTTTA GGTAAACCAG AGAATTTCCA TGTTGCATAC GCTAAAATAT 12960

CAACAGGATT TGTAGCTACC AAGAAAATAC CATCAAATTT TGATGCCATT ACTTCACCAA 13020 

CAATTGATTT GAATATTTTC AAGTTTTTAG ATACTAAATC TAAACGTGTT TCTCCAGGTT 13080 

TTTGTGCAGC ACCAGCACAG ATGACAACTA GATCCGCATC ATGACAATCA CTGTATTCGC 13140

CAGCTTTGAC ACGAACTGTT GTTGGAGAAT ATGGTGTGGC ATGTTTTAAA TCCATAACAT 13200

CTCCTCGAAC TTTTTCAGTG TCTAAATCAA TGATGACTAA TTCATCAACA ATGCTTTGGT 13260

TCACTAATGA AAATGCGTAG CTTGAACCTA CTGCACCATT ACCTATTAAT ACAACTTTGT 13320

TCCCTTTAAA TTTGTTCATT ACAAAAACTC CCTTATGATT AATTCACTAA CATACATGTA 13380

GCTTCAAATA TGTTAGTTTA ATGCTGCTTA TTGACGATAC AAAAGCAAAT AAACATCTCT 13440 

TTTATTTTCA ACGCATAACT TAAAAGGTCA TGTGTCATCC GCTTTTAAGT TTGTGATTTA 13500 

TTTCACATAT AAAATGTAAC ATGCATTAAG TACTGGGTCA ATATTAAATT GTGATTTATT 13560 

TCACATTTTA TTTTAATTTT TACACCTTTT TAATTTGTAT mCGATTACAT CTTAGATGTC 13620

TTTAGTCTTC GTACTTCGCC AGTGATTATT TACACTTTCA CATTTTTATT ATCATGTTTA 13680

CTTTTTTCTA GGAAAACAAC AATGTTTTTT GAATTAGTCA AATAAATGCG CTCAATCGTC 13740

GGTGTGCAAA CAGACXATTG TACACAATGC TTATTGATAA GTATTTAAAA AATTAAAAAT 13800 

GTCATACAAT TATCAAATTT GCCATTTTAT TTATATTTTC TCAAACCAAT TAATTGAATA 13860 

TCGAAATTTT TAGTAGAATA ATCAAAATAT ACAGATTAAA GGAGGAGTAT CATGCTTACA 13920 

GAACAAGAGA AAGACATTAT CAAACAAACG GTGCCTTTAC TTAAAGAGAA AGGGACAGAA 13980 

ATTACGTCAA TCTTTTATCC AAAAATGTTT AAAGCGCATC CTGAACTTTT AAACATGTTT 14040

AATCAAACGA ACCAAAAACG AGGCATGCAA TCTTCAGCAT TAGCACAAGC TGTAATGGCC 14100

GCAGCGGTTA ATATCGATAA CTTAAGTGTT ATTAAACCAG TCATTATGCC AGTCGCATAT 14160

AAACACTGCG CACTACAAGT TTATGCTGAA CATTATCCAA TTGTGGGGAA AAATTTATTA 14220 

AAAGCCATTC AAGACGTGAC AGGATTAGAA GAAAATGACC CTGTCATTCA AGCTTGGGCA 14280 
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(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8779 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58:





GGTATTTTnG GAnGGGTACC TAAAGCAATT CCGGCAAAGG GTnAATCCAG GTACCGAAAT 60

GGACTTCCCG TTATCGATAA TACCGACATA TATTGTGACA AGTAGATTTT ATGGACATTT 120

AGGCTTACTT TTACTTGTGA TAATTGCATG TATGTTTACT GGTATTTACC CaTCaATACA 180

TATCATTCAA TTATTGATAT ATGTACCGTT TTGTTTTTTC TTAACTGCCt CGGTGACGTT 240

ATTAACATCA ACACTCGGTG TGTTAGTTAG AGATACACAA ATGTTAATGC AAGCAATATT 300




AAGAATATTA 


TTTTACTF1T 


CACCAATTTT 


fypfin, rr & c r* a 

\9 L\a\3\- 1ALLA 


AAL*AACCATG 


GTATCAGTGG 


360 




TTTAATTCAT 


GAAATGATGA 


AATATAATCC 


n\3 111 nv 1X1 


All Vjv_ lvjAAI 


CATACCGTGC 


420 


2S 


AGCAATTTTA 


TATCACGAAT 


GGTATTTCAT 




AAA 1TAATGT 


TATACAATTT 


4 80 * 




CGGTATTGTT GCCATTTTCT TTGCAATTGG TGCGTACTTA CACATGAAAT ATAGAGATCA 540

ATTTGCAGAC TTCTTGTAAT ATATTTATAT GACGAAACCC CGCTAACCAT TAATAAATGG 600

AAGTGGGGTT CATTTTTGTT TATAATTTAA GTAAATAACA TATTAAGTTG GTGTATTATG 660

AACGTTTTAA TAAAGAAATT TTATCATTTG GTAGTTCGAA TACTTTCTAA AATGATTACG 720

CCTCAAGTGA TTGATAAACC GCATATCGTA TTTATGATGA CTTTTCCAGA AGATATTAAG 780

CCTATCATCA AAGCATTAAA TAATTCGTCG TATCAGAAAA CTGTTTTAAC AACACCAAAA 840

CAACffiGCCTT ATTTATCTGA ACTTAGCGAC GATGTTGATG TGATAGAAAT GACTAATCGA 900

ACATTGGTAA AACAAATTAA GGCTTTGAAA AGCGCGCAGA TGATTATTAT CGATAATTAT 960

TACCTATTGC TAGGTGGATA TAATAAGACT TCTAATCAAC ACATTGTTCA AACGTGGCAT 1020

GCAAGTGGTG CATTAAAAAA CTTTGGCTTA ACAGATCATC AAGTCGATGT GTCTGACAAG 1080

GCAATGGTTC AGCAGTACCG TAAAGTTTAT CAAGCGACGG atitttactt AGTGGGTTGT 1140

GAACAAATGT CACAATGTTT TAAACAGTCT TTAGGTGCAA CAGAAGAGCA AATGCTGTAT 1200

TTTGGGCTTC CGAGAATTAA TAAATATTAC ACAGCTGATA GAGCAACGGT TAAGGCAGAG 1260

TTAAAGGATA AATATGGAAT TACAAATAAG TTGGTATTAT ATGTACCAAC ATATAGAGAA 1320

GATAAAGCAG ATAATAGGGC TATTGATAAA GCTTATTTTG AAAAATGTTT ACCAGGATAT 1380
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ATCGACACGT CTACATTAAT GCTAATGTCA GATATAATTA TTAGCGACTA TAGTTCGCTG 1500 

CCAATAGAAG CTAGCTTGTT AGATATTCCA ACTATATTTT ATGTGTATGA TGAAGGAACA 1560 

TATGATCAGG TGAGAGGCCT GAATCAATTT TACAAAGCAA TACCGGATAG CTACAAAGTG 1620 

TATACTGAAG AAGATTTAAT AATGACGATA CAAGAAAAAG AACATCTATT AAGTCCGTTA 1680 

TTTAAAGATT GGCATAAGTA TAATACTGAT AAAAGTTTAC ATCAGCTCAC AGAATATATA 1740 

GATAAGATGG TGACAAAATG AGGTTTACGA TAATCATACC TACATGTAAT AATGAGGCAA 1800 

CAATTCGACA ATTGTTAATA TCTATTGAGA GTAAAGAACA CTATAGAATC CTTTGTATTG 1860 

ATGGTGGTTC TACTGATCAA ACAATTCCTA TGATTGAACG GTTACAAAGA GAACTCAAGC 1920 

ATATTTCATT AATACAATTA CAAAATGCTT CGATAGCTAC GTGTATTAAT AAAGGTTTGA 1980 

TGGATATCAA AATGACAGAT CCACATGATA GTGACGCATT TATGGTCATA AAACCAACAT 2040

CAATCGTATT GCCAGGTAAA TTAGATAGGT TAACTGCTGC TTTCAAAAAT AATGATAATA 2100

TTGATATGGT AATAGGGCAG CGAGCTTACA ATTACCATGG TGAATGGAAA TTGAAAAGTG 2160

CTGATGAGTT TATTAAAGAC AATCGAATCG TTACATTAAC GGAACAACCA GATTTGTTAT 2220

CAATGATGTC TTTTGACGGA AAGTTATTCA GTGCTAAATT TGCTGAATTA CAGTGTGaCG 2280

AAACTTTAGC TAACaCATAC AATCACGCAA TACTTGTCAA GGCGATGCAA AAAGCTACGG 2340

ATATACATTT AGTTTCACAG ATGATTGTCG GAGATAACGA TATAGATACA CATGCTACAA 2400

GTAACGATGA AGATTTTAAT AGATATATCA CAGAAATTAT GAAAATAAGA CAACGAGTCA 2460 

TGGAAATGTT ACTATTACCT GAACAAAGGC TATTATATAG TGATATGGTT GATCGTATTT 2520 

TATTCAATAA TTCATTAAAA TATTATATGA ACGAACACCC AGCAGTAACG CACACGACAA 2580 

TTCAACTCGT AAAAGACTAT ATTATGTCTA TGCAGCATTC TGATTATGTA TCGCAAAACA 2640 

TGTTTGACAT TATAAATACA GTTGAATTTA TTGGTGAGAA TTGGGATAGA GAAATATACG 2700 

AATTGTGGCG ACAAACATTA ATTCAAGTGG GCATTAATAG GCCGACTTAT AAAAAATTCT 2760 

TGATACAACT TAAAGGGAGA AAGTTTGCAC ATCGAACAAA ATCAATGTTA AAACGATAAC 2820 

GTGTACATTG ATGACCATAA ACTGCAATCC TATGATGTGA CAATATGAGG AGGATAACTT 2880 

AATGAAACGT GTAATAACAT ATGGCACATA TGACTTACTT CACTATGGTC ATATCGAATT 2940

GCTTCGTCGT GCAAGAGAGA TGGGCGATTA TTTAATAGTA GCATTATCAA CAGATGAATT 3000

TAATCAAATT AAACATAAAA AATCTTATTA TGATTATGAA CAACGAAAAA TGATGCTTGa 3060

ATCAATACGC TATGTCGATT TAGTCATTCC AGAAAAGGGC TGGGGACAAA AAGAAGACGA 3120 

TGTCGAAAAA TTTGATGTAG ATGTTTTTGT TATGGGACAT GACTGGGAAG GTGAATTCGA 3180 
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TAAAATCAAA CAAGAATTAT ATGGTAAAGA TGCTAAATAA ATTATATAGA ACTATCGATA 3300




CTAAACGATA AATTAACTTA GGTTATTATA AAATAAATAT AAAACGGACA AGTTTCGCAG 3360

CTTTATAATG TGCAACTTGT CCGTTTTTAG TATGTTTTAT TTTCTTTTTC TAAATAAACG 3420

ATTGATTATC ATATGAACAA TAAGTGCTAA TCCAGCGACA AGGCATGTAC CACCAATGAT 3480

AGTGAATAAT GGATGTTCTT CCCACATACT TTTAGCAACA GTATTTGCCT TTTGAATAAT 3540

TGGCTGATGA ACTTCTACAG TTGGAGGTCC ATAATCTTTA TTAATAAATT CTCTTGGATA 3600

GTCCGCGTGT ACTTTACCAT CTTCGACTAC AAGTTTATAA TCTTTTTTAC TAAAATCACT 3660

TGGTAAAACA TCGTAAAGAT CATTTTCAAC ATAATATTTC TTACCATTTA TCCTTTGCTC 3720

ACCTTTAGAC AATATTTTTA CATATTTATA CTGATCAAAT GAGCGTTCCA TTAATGCATT 3780

CCCCATCATA TTACGTTGCT TCTCGCCACC AAGGTTTTTA TAGTCTCCTG CACCCATGAT 3840

AACTTGATTA ATTCTAAATT TACCTCGTTT GGTAGTAATC GTATGGTTGT AATTTGCTGT 3900

ATCACTTGAT CCAGTTTTTA AACCATCTGT ACCCGGCAAA CTCATTTTTG CACCTTCCAA 3960

TGAAAAGTTG AATGTGTAAT ACGTAACTGC ATGCGTTGTT GGTGCTAACT GCTTTGTAAA 4020

GTCTAATATT TTAGGTGTCT CTTTAATCAC GTGTAAATCT AAAATGGCAT AGTCTCTAGC 4080

AGTCGTTACA GTACGTTCTT GGTCTTTATA CTTTGTTGGT GCAAATGTAC GTAATCTTGA 4140

ATTTTCAGCA CCCGTTGGAT TGACGAAATG TGTATTTTTC ATTCCGATAG CTTTAGCTTT 4200

GTTATTCATT AAATCAACGA AATCGCTGGT GTrrrrrGAA ACCTTCTTAG CTAAAATTAA 4260

TGCCGCGGCA TTACTAGAAT TAGATACTGT AATTTGTAAT AGGTCTGCGA TTGTCCATAC 4320

TTGTCCAGGA TATAGTTTCG TATTACTCAA CTCAGGTAGT GTAGACATAA TATATTCTTT 4380

GTTCGTCATT GTGACTGTGT CATCAAGTGA AAGCTGCCCC TTATTTACAG CTTCCAATGT 4440

TAAGTACATT GTCATTAATT TAGTCATAGA CGCTGGAtTC CACTTAGTAT CGATATTGTA 4500

TTGATACAGT AATTGTCCAG TTTGACTTAC ATTAACAGCA CTCGTCGGTT CGTATGCAGC 4560

CGACAAACCT GCATAACCAT ATTGATTTGC TGCTTGTACA GGGGTTACGT CACTGTTAGT 4620

AGCTTGTGCA TATGGTGTCA TAATACTTAA TGTTAAACAT AAAATGATGA TAATAGATAT 4680

TAAATTTTTC ATAAAGCGTT AATCTTCCCT TTTCCAATTC TTAAATATTC CCTAAAAGCA 4740

ATGGTTATTC CTACTTACGG AAATCATTGC TAATTCACTT CACCTTAATT AAATTGTTGA 4800

AAATAAAGTT TTCTGCAGTT AATTTGAAAA ATAATGCAAA TATATTACGT GTGTAGCTAA 4860

AGGTGTTATA ATGTTTGTAC GAAGAGCAAA CTTACTCAAA AGCGATTAAT TTTCATGTTT 4920

TAATATAAAG ACTTTGAGAA GTTATTACAA AAAATGCAAT AGAAATATTC TATCATATAA 4980
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AAGTATATGA TAGAAATGCA TGTATCTATC TAAATGAATT AACTATAAAT TTCAAACAGA 5100 

AGAGGTAAAA CTATGAAACG AGAAAATCCA TTGTTTTTCT TATTTAAAAA ACTATCATGG 5160 

CCAGTGGGTC TTATCGTTGC AGCTATCACT ATTTCATCAC TAGGGAGCTT AAGTGGACTA 5220

TTAGTGCCAC TGTTTACTGG ACGAATTGTA GATAAATTTT GCgTGAGCCA TATCAATTGG 5280

AATCtAATCG CATTATTTGG TGGTATCTTT GTCATCAATG CTTTATTAAG CGGATTAGGT 5340

TTATATTTAT TAAGTAAAAT TGGTGAAAAG ATTATTTATG CGATACGCTC AGTTTTATGG 5400

GAGCATATCA TACAATTAAA AATGCCATTC TTTGACAAAA ATGAAAGTGG TCAATTAATG 5460

AGTCGATTAA CTGACGATAC GAAAGTGATA AATGAATTTA TTTCACAAAA GCTACCTmAC 5520

TTATTACCAT CAATCGTTAC ATtAGTTGGG TCACTAATCA TGTTATTTAT TTTAGATTGG 5580

AAAATGACAT TATTAACATT TATAACGATA CCGATATTCG TTTTaATTAT GATTCCTCTA 5640

GGTCGTATTA TGCAAAAGAT ATCGACAAGT ACACAATCTG AAATTGCAAA CTTCAGTGGT 5700

TTGTTAGGGC GTGTCCTAAC TGAAATGCGT CTTGTTAAAA TATCAAATAC AGAGCGTCTT 5760

GAATTAGATA ATGCACATAA AAATTTGAAT GAAATATATA AATTAGGTTT AAAACAGGCT 5820

AAAATTGCGG CAGTTGTACA ACCAATTTCA GGTATAGTTA TGTTGCTAAC AATTGCAATT 5880

ATTTTAGGTT TTGGTGCATT AGAAATTGCG ACTGGTGCAA TCACTGCAGG TACATTAATT 5940

GCAATGATAT TTTATGTTAT TCAGTTATCT ATGCCTTTAA TCAATCTTTC CACGTTAGTT 6000

ACAGATTATA AAAAGGCAGT CGGTGCAAGT AGTAGAATAT ACGAAATCAT GCAAGAACCT 6060

ATTGAACCGA CAGAAGCTCT TGAAGATTCT GAAAATGTAT TAATTGATGA CGGTGTATTG 6120

TCATTTGAAC ATGTAGACTT TAAATATGAT GTGAAGAAAA TATTAGATGA TGTGTCGTTC 6180

CAAATCCCAC AAGGTCAAGT GAGTGCTTTT GTAGGCCCTT CTGGGTCTGG TAAAAGTACG 6240

ATATTTAATC TGATAGAACG TATGTATGAA ATTGAGTCAG GTGATATTAA ATATGGCCTT 6300

GAAAGTGTCT ATGATATCCC GTTATCTAAG TGGCGACGCA AAATTGGATA TGTTATGCAA 6360

TCAAATTCGA TGATGAGTGG TACAATTAGA GACAATATTT TATACGGAAT TAATCGTCAT 6420

GTTTCAGATG AAGAACTTAT TAATTATGCT AAATTAGCGA ACTGTCATGA TTTTATCATG 6480

CAATTTGATG AAGGATATGA CACGCTTGTA GGTGAACGAG GATTGAAACT GTCTGGCGGA 6540

CAACGTCAAC GTATTGATAT TGCTAGAAGT TTTGTTAAAA ATCCTGATAT TTTGTTACTT 6600

GATGAAGCAA CAGCTAATCT CGATAGTGAA AGTGAATTGA AAATTCAAGA AGCTTTAGAA 6660

ACATTGATGG AAGGTAGAAC AACGATTGTC ATTGCGCATC GTTTGTCTAC AATTAAAAAA 6720

GCCGGTCAAA TTATATTCTT AGACAAAGGA CAGGTAACAG GTAAAGGTAC GCATTCAGAA 6780
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TTTTATATAT 


ATAAGTAAGC 


TTGGAGCAAA 


TACACATATA 


CCATCGAGGA 


& A TT A A A /TY* 


6900 




TGGCACATTG 


ATGGATATAG 


ATGTTAATAA 


ATTGCTTCAA 


gcttttgtct 


ATTTT A A A TP 


6 960 


5 


ATTTGAGAAG 


TTACGACATA 


ATAATTCTTA 


AATTAATGAA ATCGATATTT 


TAArt A A A & A A 


7020 




ATGCTCATGG 


TATAATACAA 


GTTATAAGCA 


AACATACATA 


TATTAAATAC 


TP.T acipp 2v m 


7Q80 




AGTCATAATT 


CTTCATATTT 


TACATAGCAA 


TTTAACTGAT 


TTTAGAGTCC 




714 0 


10 


AGTTTGATAT 


TTCAATGTTT 


CTAAATTTTT 


AAAAAATTAA 


ATCATAGGTG 




7200 




GTTTTTATTA 


ATCAACATTA 


TTGG TCTAAT 


TGTATTTCTT 


GGTATTG CGG 


X AX X AX 1 11L 


7260 


IS 


AAGAGATCGC 


AAAAATATCC 


AATGGCAATC 


AATTGGGATC 


TTAGTTGTTT 


TR A A rPTTTT 


732 0 


TTTAGCATGG 


TTCTTTATTT 


ATTTTGATTG 


GGGTCAAAAA 


GCAGTAAGAG 




73 80 




TGGTATCGCT 


TGGGTAGTTC 


AGTCAGCGCA 


TGCTGGTACA 


GGTTTTGCAT 


1 X uLAAu i I T 


7440 


20 


GACAAATGTT 


AAAATGATGG 


ATATGGCTGT 


TGCAGCCTTA 


TTCCCAATAT 


1 A 1 1 AATAVjT 


7500 




GGGATTATTT 


GATATCTTAA 


* \J W ill 


TATTTTACCG 


AAAATTATTG 


bAtjb 1 A x x \3G 


7560 




TTGGTTACTA 


GCTAAAGTAA 


CAAGACAACC 


TAAATTCGAG 


TCATTCTTTG 


OviA 1 ACjAAAx 


7620. 




GATGTTCTTA 


GGAAATACTG 


AAGCATTAGC 


CGTATCAAGT 


GAGCAACTAA 


AA\Jvj 1 A i YjAA 


7680* 




TGAAATG CGT 


GTATTAACAA 


TCGCAATGAT 


GTCAATGAGC 


TCTGTATCGG 


VjAVjv»XAX X Vj 1 


774 0- 




AGGTGCGTAT 


GTACAAATGG 


TACCAGGAGA 


ACTGGTACTA 


ACGGCAATTC 


PAl^TR A AT" A f» 


7800- 


30 


CGTTAACGCG 


ATTATTGTGT 


CATGCTTGTT 


GAATC CAGTA 


AGTGTTGAAG 




786 0 




TATTATTTAC 


AGTCTTAAAA 


ACAATGAAGT 


TGAACGTCAA 


CCATTCTTCT 


V-Hl XwV. X 






AGATTCTGTA 


TTAGCAGCAG 


GTAAATTAGT 


ATTAATCATC 


ATCGCATTTG 


X X n X X M.O XXX 


1 Q O rt 


35 


TGTAGCGTTA GCTGATCTAT 


TTGATCGTTT 


TATCAATTTG 


ATTACAGGAT 


TR AT A CIC A fTP 


B04U- 




ATGOaTAGGC ATAAAAGGTA 


GTTTCGGTTT 


AAACCAAATT 


TTAGGTGTGT 


X iniUiAlLL 


o XUU 


40 


ATTTGCGCTA TTACTCGGTT 


TACCTTATGA 


TGAAGCGTGG 


TTGGTAGCAC 


AAPJLAATCtftP 




TAAGAAAATT 


GTTACAAATG 


AATTTGTTGT 


TATGGGTGAA 




ATATTP.PATP 


A O "5 A 




TTATACACCA 


CACCATCGTG 


C GGTT ATT AC 


AACATTCTTA 


ATTTCATTTG 


PAAA( "1" IV "If* 




45 


AACGATTGGT 


ATGATTATCG 


GTACATTGAA 


AGGCATTGTT 


GATAAAAAGA 


P ATP Afl A PTT 
^-n x uMunL X X 






TGTATCTAAA TATGTACCTA TGATGCTATT ATCAGGTATC CTAGTTTCAT TATTAACAGC 8400

AGCTTTCGTT GGTTTATTTG CATGGTAATA TGTCGAAGAG TGACTATGAT AATACATTTT 8460

AACTAATAAA TATGTCCAGG CATGTCGTCT ATTGATATAG GTGAGATGCT TGGACTTTTT 8520

TATTATTGAT ATAAAGGTAT nTAAATATTT TTAAAGTTAC CGAAATTGAA GCATTATAAA 8580
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GACAGTAAGG ACTAGGTACA GTCATAGTAC TTCGAGCAAA ATTTGTTTTG TTATTATAAA 8 700 

CAACACAAAG GAGATAACTT CTCTAnTGAA GAAGTTAAAA ACATTATAGC AGACAATGAA 8760 
ATGAAAGTAA ATTAAAAAT 8779

(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31096 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

GTTGCAGTAG TCAAAGAATT AAACAAGGTG AAGGcGTGTA GCTTGCACAC CCGAAAATGT 60 

GCGTAAGTTA aCGGATGCAG GACATAAAGT AATTGTTGAA AAAAATGCTG GCATTGGTTC 120

AGGATTTTCT AACGATATGT ATGAAAAAGA AGGCGCTAAG ATCGTAACTC ACGAACAAGC 180 

ATGGGAAGCT GATCTTGTTA TCAAAGTAAA AGAACCTCAT GAAAGCGAAT ATCAATATTT 240

CAAAAAGAAT CAAATTATCT GGGGATTTTT ACATCTAGCA TCTTCAAAAG AAATAGTAGA 300 

AAAAATGCAA GAAGTTGGTG TAACTGCGAT TAGTGGTGAA ACCATTATAA AAAATGGAAA 360 

AGCAGAATTA TTAGCGCCAA TGAGTGCTAT AGCAGGTCAA CGCTCAGCAA TTATGGGAGC 420

TTACTACTCT GAAGCACAAC ATGGTGGTCA AGGTACTTTA GTGACTGGTG TACATGAAAA 480 

TGTGGATATA CCTGGTAGTA CATATGTGAT TTTCGGTGGT GGAGTAGCAG CAACAAATGC 540

AGCAAATGTT GCCTTGGGAC TAAATGCTAA AGTAATCATT ATCGAGTTAA ACGATGACCG 600 

CATTAAATAT CTTGAAGATA TGTATGCAGA AAAAGATGTC ACAGTAGTCA AATCAACACC 660 

AGAftAATTTA GCAGAACAAA TTAAGAAAGC AGATGTATTT ATTTCTACAA TTTTAATTTC 720 

AGGTGCGAAA CCGCCAAAAT TGGTTACTCG TGAGATGGTT AAATCAATGA AAAAAGGTTC 780 

AGTATTAATC GATATAGCTA TTGACCAAGG TGGAACTATT GAAACAATTA GACCAACTAC 840

AATTTCTGAT CCAGTGTATG AAGAAGAAGG TGTGATTCAT TATGGTGTAC CAAATCAACC 900 

AGGAGCAGTC CCAAGAACTT CAACAATGGC ATTAGCACAA GGAAATATTG ATTATATATT 960

AGAAATTTGT GACAAAGGCT TAGAACAAGC AATTAAAGAT AATGAAGCCT TAAGTACTGG 1020 

TGTAAACATT TACCAAGGAC AAGTGACAAA TCAAGGATTA GCTTCATCAC ATGACCTAGA 1080 

TTATAAAGAA ATATTAAATG TTATCGAATA GATAGTAATT TAAATGAAAT TGAGTGAAAT 1140

GAATATTTTA AATATAGCAT TATAGTTTGG ACTAAAAATT TACAAAACGG AAGGATGTAA 1200

TCGAAGAAGC TAAAGCAAGC ATTAAACCAT TTATTCGTCG AACACCTCTA A A iAAATCAA 1320

TGTATTTAAG CCAAAGTATA ACTAAAGGGA ATGTATTTCT AAAATTAGAA 1380

TCACAGGATC TTTTAAATTT AGAGGCGCTA gCAATnAAAA TTAATCACTT 1440

CAAAAAGAAA AAGGCATTAT CGCAGCATCT GCTGGGgAAC CATGCACAAG viwl 1 OL- 1 1 X 1500

AACAGCTAAA TTATTAGGCA TTGATGCAAC GATTGTAATG CCTGAAACAG 1560

GAAACAACAA GCAACAAAAG GCTATGGGGC AAAGGTTATT TTAAAAGGTA nMAML I 11 AA 1620

CGAAACTAGA CTTTATATGG AAGAATTAGC GAAAGAAAAT GGCATGACAA TrrTTrhTpp 1 V_Vj i 1 1 1680

ATATGACGAT AAGTTTGTAA TGGCAGGCCA AGGAACAATT GGTTTAGAAA 111 1 ACjATCxA 1740

TATTTGGAAT GTGAATACAG TCATCGTACC AGTTGGCGGT GGAGGATTAA ^^^^^^ A ^^^^^^ Jl 1 1\» L-ALrijT AT 1800

TGCCACCGCA TTAAAATCAT TTAACCCTTC AATTCATATT ATCGGTGTTC AA1 U IVjAvJAA 1860

TGTTCATGGT ATGGCTGAGT CTTTCTATAA GAGAGATTTA ACTGAACATC V»Avj 1 VjCjATAG 1920

CACAATAGCA GATGGTTGTG ATGTAAAAGT TCCTGGTGAA CAAACATATG AACj 1 AGTTAA 1980

ACATTTAGTA GATGAATTTA TTCTTGTTAC TGAAGAAGAA ATTGAACATG WTATGAAAGA 2040

TTTAATGCAG CGTGCCAAAA TTATTACTGA AGGTGCAGGC GCATTACCAA t-AC»CTGCAAT 2100

TTTAAGTGGA AAAATAAACA ATAAATGGCT TGAAGATAAA AATGTTGTTG CATTAGTTTC 2160

AGGCGGGAAT GTTGACTTAA CTAGAGTTTC AGGTGTCATT GAACATGGAC I oAATATTG C 2220

AGATACAAGC AAGGGTGTGG TAGGTTAAAA CATTTAATCT TAAAAATGAG vj l\j 1 AA 1 TAT 2280

GTCAAATGGT AAAGAATTAC AAAAAAATAT AGGTTTCTTC TCAGCGTTTG v- 1 Al lVj i TAT 2340

GGGGACAGTT ATTGGTTCAG GAGTATTCTT TAAAATATCA AACGTAACAG AAfiTaararir 2400

AACAGCAGGA ATGGCCTTGT TTGTATGGTT CCTAGGCGGC ATCATTACCA 1 1 1 V* 1 VjUjw 2460

GTTAACAGCA GCAGAACTTG CTGCTGCAAT CCCTGAAACA GGTGGCTTAA UjnAulAlAl 2520

AGAATATACA TACGGTGATT TCTGGGGCTT CCTATCAGGT TGGGCGCAAT #\i 1 Inl 1 in 2580

TTTTCCAGCT AACGTAGCAG CATTGTCTAT CGTATTTGCG ACACAGCTAA 2640

CCATTTATCT ATAGGTTCGT TAATACCAAT AGCAATCGCA TCTGCGTTAT V* Inl lulu X 1 2700

GATAAATTTC CTAGGTTCAA AAGCAGGCGG AATTTTACAA TCAGTTACTT XAulAAl 1AA 2760

ACTGATTCCA ATCATCGTTA TTGTAATTTT TGGTATTTTT CAATCTGGAG ATATCACTTT 2820

TTCATTAATT CCAACTACAG GTAATTCaGG AAATGGCTTC TTTACAGCAA TTGGTAGTGG 2880

TTTATTAGCA ACTATGTTTG CATATGATGG TTGGATTCAT GTAGGAAATG TTGCGGGGGA 2940




ACTTAAAAAT CCTAAACGCG ATTTACCTTT AGCGATTTCA GTTGGTATCG GTTGTATTAT 3000

TGGTAATTTA AATGCAGCTT CAGATACATC AAAAATATTA TTTGGTGAAA ATGGCGGTAA 3120

GATTATTACA ATCGGTATAT TAATTTCTGT TTATGGTACG ATCAATGGCT ATACTATGAC 3180

TGGTATGCGC GTACCATATG CAATGGCTGA AAGAAAATTA TTGCCATTTA GCCATTTATT 3240

CGCAAAATTA ACAAAATCTG GCGCACCATG GTTTGGCGCA ATTATACAAC TTATAATCGC 3300

TATCATCATG ATGTCAATGG GAGCATTTGA TACAATTACA AATATGTTAA TCTTTGTTAT 3360

TTGGTTGTTC TATTGTATGT CATTTGTTGC GGTAATAATT TTAAGAAAAC GTGAACCAAA 3420

TATGGAACGA CCATATAAAG TACCGTTATA TCCGATCATA CCTTTAATTG CTATTTTGGC 3480

AGGATCATTT GTATTAATTA ATACACTGTT TACACAATTT ATATTAGCAA TCATTGGAAT 3540

TCTAATAACA GCACTTGGTA TACCAGTTTA TTACTATAAA AAGAAACAAA AAGCAGCATA 3600

AGGTAAGATA ACTAGCATTG AGAATAAATG GATGGACTAC TAATAAATTT AAAGTTTTAC 3660

ACATTAAAAT CAAAAACCAT TCAATTATTC TATGGAACAG ACAAATTTCT GTTATGGAAT 3720

TTGTCTGTTT TTCAAAAGTA TAGGGAGGCA AATAGAGATG GAAAAGCCGT CAAGAGAGGC 3780

ATTTGAAGGC AATAATAAGT TGTTAATAGG AATTGTTCTA AGTGTAATAA CGTTTTGGCT 3840

ATTTGCACAA TCATTGGTTA ATGTTGTACC AATACTTGAA GATAGTTTCA ATACAGATAT 3900

TGGAACGGTT AATATCGCCG TTAGTATAAC TGCTTTATTT TCAGGAATGT TTGTAGTAGG 3960

AGCAGGTGGT CTTGCTGATA AATATGGCAG AATTAAACTC ACGAACATTG GTATTATCTT 4020

AAATATATTA GGTTCATTAT TAATCATTAT TTCAAATATT CCTTTATTAC TTATTATAGG 4080

AAGATTAATT CAAGGACTTT CAGCAGCATG TATTATGCCT GCAACTTTGT CTATTATTAA 4140

GTCATATTAC ATTGGGAAAG ATAGACAACG CGCTTTAAGT TATTGGTCAA TTGGCTCATG 4200

GGGCGGCTCT GGTGTTTGTT CATTTTTTGG AGGTGCAGTT GCAACGCTTT TAGGTTGGCG 4260

TTGGATTTTC ATCCTATCAA TTATAATTTC ATTAATTGCA CTGTTTCTTA TTAAAGGCAC 4320

ACCTGAAACT AAATCTAAAT CGATTTCTCT AAATAAATTT GACATTAAAG GTCTGGTTCT 4380

TTTAGTCATT ATGCTCCTCA GTTTAAATAT TTTAATTACT AAAGGATCAG AATTAGGTGT 4440

AACCTCACTT CTTTTTATTA CTTTATTAGC TATTGCAATT GGATCTTTTA GTTTATTTAT 4500

AGTTCTTGAA AAGCGTGCTA CAAATCCTTT AATCGATTTT AAATTATTTA AAAATAAAGC 4560

TTACACAGGT GCAACAGCTT CAAACTTTTT GTTAAATGGT GTTGCAGGAA CATTAATAGT 4620

AGCCAACACA TTTGTTCAAA GAGGTTTAGG ATATTCTTCA TTGCAAGCAG GAAGTTTATC 4680

AATCACTTAT TTAGTAATGG TACTAATTAT GATTCGTGTT GGTGAAAAGT TACTTCAAAC 4740

ACTCGGATGC AAGAAACCAA TGTTAATTGG AACAGGAGTT CTTATTGTCG GAGAATGTCT 4800

ATTCTTTGGT TTAGGACTAG GGATATATGC TACACCATCA ACAGATACAG CAATTGCAAA 4920

TGCACCGTTA GAAAAAGTAG GCGTTGCTGC AGGTATCTAT AAAATGGCTT CTGCATTAGG 4980

TGGAGCATTT GGCGTCGCAT TGAGTGGTGC AGTATATGCA ATCGTATCAA ATATGaCAAA 5040

CATTTATACA GGTGcAATGa TTGnCATTAT GGTTaAATGC AGGTATGGGa ATATTATCaT 5100

TCGTTATCAT TTTGtTACTT GTGcCTAAAC mAAACGACAC TCAATTATGA TAATTGAGAA 5160

TTAAATTGAA ATCATACAAG TCGCTACAAT ATTAAACAAA AATATAAACC GATTCTTATG 5220 

TGTCATTATT TTAAATGAAC ATAGGGATTG GTTTTTTATT ACTCTTTTAC GCTACTTTAT 5280 

TTATAATTAT TATAAATTGT CACAAATTCA ATTTACCTTA CAATATATTT TGTGTTATTA 5340

TATTCTGGAG CATAAATAAA TTGTTCAACA CATAGTTGTA ATGTGTTTCA ATACTTTTTG 5400

GATAGATTGC GAAATTGTAT TGAATCGTCA TCGTTTTAAA TTTTTAAATG AGAATGGAAT 5460

GAGCATTACA ATACACAAGC AATCAAAAGT AAATACATTC ACAACACAAC AGAGACATAA 5520

CAACAAGATA AGGAGTGAAC AATAGCTGTG AATTATCGTG ATAAAATTCA AAAGTTTAGT 5580

ATTCGTAAAT ATACAGTTGG TACATTTTCA ACTGTCATTG CGACATTGGT ATTTTTAGGA 5640

TTCAATACAT CACAAGCACA TGCTGCTGAA ACAAATCAAC CAGCAAGCGT GGTTAAACAG 5700

AAACAACAAA GTAATAATGA ACAGACTGAG AATCGAGAAT CTCAAGTACA AAATTCTCAA 5760 

AATTCACAAA ATGGTCAATC ATTATCTGCT ACTCATGAAA ATGAGCAACC AAATATTAGT 5820 

CAAGCTAATT TAGTAGATCA AAAAGTAGCG CAATCATCTA CTACTAATGA TGAACAACCA 5880 

GCATCTCAAA ATGTAAATAC AAAGAAAGAT TCGGCAACGG CTGCGACAAC ACAACCAGAT 5940

AAAGAACAAA GTAAGCATAA ACAAAACGAA AGTCAATCTG CTAATAAAAA TGGAAACGAC 6000 

AATAGAGCGG CTCATGTAGA AAATCATGAA GCAAATGTAG TAACAGCTTC AGATTCATCT 6060 

GATAATGGTA ACGTACAACA TGACCGAAAT GAATTACAAG CGTTTTTTGA TGCAAATTAT 6120 

CATGATTATC GCTTTATTGA CCGTGAAAAT GCAGATTCTG GCACATTTAA CTATGTAAAA 6180 

GGCATTTTTG ATAAGATTAA TACGTTATTA GGCAGTAATG ATCCAATAAA CAATAAAGAC 6240

TTGCAACTTG CATACAAAGA ATTGGAACAA GCTGTTGCTT TAATTCGTAC AATGCCTCAA 6300

CGTCAACAGA CTAGCCGACG TTCAAATAGA ATTCAAACGC GTTCGGTTGA GTCAAGAGCT 6360

GCAGAGCCTA GATCAGTATC AGACTATCAA AATGCAAATT CATCATATTA TGTTGAAAAT 6420

GCTAATGATG GTTCGGGCTA TCCTGTTGGT ACATATATCa ATGCTTCTAG TAAAGGGGCG 6480

CCATATAATT TACCAACTAC ACCATGGAAT ACATTGAAGG CCTCTGACTC AAAGGAAATT 6540

GCTCTTATGA CAGCGAAACA AACTGGAGAC GGGTACCAAT GGGTTATTAA GTTTAATAAA 6600 
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GTAGGAAGAA CTGACTTTGT AACAGTTAAT TCAGATGGAA CAAATGTACA ATGGAGTCAT 6720 

GGAGCAGGAG CAGGTGCAAA TAAACCACTT CAACAAATGT GGGAATATGG AGTAAATGAT 6780 

CCTCATCGTT CACATGACTT TAAAATAAGA AATAGAAGTG GCCAAGTAAT ATATGACTGG 6840

CCAACTGTCC ATATTTATTC TTTAGAAGAT TTATCTAGAG CGAGTGATTA TTTTAGTGAA 6900

GCTGGAGCGA CACCTGCTAC TAAAGCTTTT GGTAGACAAA ATTTTGAATA TATTAATGGT 6960

CAAAAACCTG CTGAATCACC GGGTGTTCCT AAAGTTTATA CTTTCATCGG TCAAGGTGAT 7020

GCAAGTTATA CAATTTCATT TAAAACACAA GGTCCAACTG TTAATAAATT GTACTATGCA 7080

GCAGGTGGGC GTGCTTTAGA GTACAATCAA TTATTTATGT ACAGTCAACT ATACGTCGAA 7140

TCAACGCAAG ACCATCAACA ACGTCTTAAT GGTTTAAGAC AAGTGGTTAA TCGTACATAT 7200 

CGCATAGGTA CAACTAAACG TGTAGAAGTG AGTCAAGGAA ATGTACAAAC GAAAAAGGTA 7260 

TTAGAAAGTA CAAACCTAAA TATAGATGAT TTTGTTGATG ATCCTTTAAG TTATGTTAAG 7320 

ACGCCGAGTA ATAAAGTGTT AGGATTTTAT TCGAATAATG CAAATACTAA TGCTTTTAGA 7380

CCGGGTGGAG CCCAACAATT AAATGAATAT CAATTAAGTC AATTATTTAC TGATCAAAAA 7440

TTACAAGAAG CAGCAAGAAC TAGAAACCCA ATAAGATTAA TGATTGGTTT CGACTATCCT 7500

GATGCTTATG GTAATAGTGA AcTTTAGTTC CTGTTAACTT AACGGTATTA CCTGAAATCC 7560

AACATAATAt TaAATTCTTT AAAAATGACG ATACTCAAAA TATTGCTGAA AAACCATTTT 7620

CAAAACAAGC TGGGCATCCA GTTTTCTATG TATATGCAGG TAACCAAGGG AATGCTTCCG 7680

TGAATTTAGG TGGTAGCGTA ACATCTATTC AACCATTACG TATTAATTTA ACAAGTAATG 7740

AGAATTTTAC AGATAAAGAT TGGCAAATTA CAGGTATTCC GCGTACATTA CACATTGAAA 7800 

ACTCGACAAA TAGACCTAAT AATGCCAGAG AACGCAATAT TGAACTTGTT GGTAACTTAT 7860

TACC&GGGGA TTACTTTGGA ACGATACGTT TTGGACGTAA AGAACAATTA TTCGAAATTC 7920 

GTGTTAAACC ACATACACCA ACAATTACAA CGACAGCTGA GCAATTAAGA GGTACAGCAT 7980 

TACAAAAAGT GCCTGTTAAT ATTTCGGGAA TACCGTTGGA TCCATCGGCA TTGGTTTATT 8040 

TAGTTGCACC AACAAATCAA ACTACGAATG GTGGTAGTGA GGCAGATCAA ATACCATCTG 8100 

GTTATACGAT ACTTGCGACT GGTACACCTG ATGGGGTGCA TAATACAATT ACTATACGAC 8160

CGCAAGATTA TGTTGTATTC ATACCACCTG TAGGTAAACA AATTAGAGCA GTAGTTTATT 8220 

ATAATAAAGT AGTTGCATCT AATATGAGTA ATGCTGTTAC TATTTTGCCA GATGACATTC 8280 

CACCAACAAT CAATAATCCT GTTGGAATAA ATGCCAAATA CTATCGAGGC GACGAAkCAA 8340
CTTTACAATG GGTGTCTCTG ATAGACATTC TGGTATAAAA AATACAACTA TTACGACATT 8400

TACAGGTAGA GTGAGTATGA ATCAGGCATT TAACAGTGAT ATTACATTTA AAGTGTCAGC 8520

GACAGaCAAT GTCAATAATA CGACAAATGA TAGTCAATCT AAACATGTTT CAATTCATGT 8580

AGGTAAAATT AGTGAAGATG CTCATCCGAT TGTATTAGGA AATACTGAGA AAGTTGTAGT 8640

AGTCAATCCG ACTGCTGTAT CTAATGATGA AAAGCAAAGC ATAATTACTG CCTTTATGAA 8700

TAAAAACCAA AATATAAGAG GATATTTAGC ATCAACTGAT CCAGTAACTG TCGATAATAA 8760

TGGTAATGTC ACATTACATT ACCGTGATGG CTCATCGACA ACGCTTGATG CTACAAATGT 8820

GATGACATAC GAACCAGTTG TGAAACCTGA ATACGAAACT GTCAATGCTG CTAAAACAGC 8880

AACGGTAACG ATTGCTAAAG GACAATCATT TAGTATTGGT GATATTAAAC AATATTTTAC 8940

TTTAAGTAAT GGACAACCTA TTCCAAGTGG CACATTTACA AATATTACAT CTGATAGAAC 9000

TATTCCAACT GCACAAGAAG TTAGTCAAAT GAACGCAGGC ACGCAGTTAT ACCATATAAC 9060

TGCTACAAAT GCGTATCATA AAGATAGTGA AGACTTCTAT ATTAGTTTGA AAATCATCGA 9120

TGTGAAACAA CGAGAAGGCG ATCAACGTGT ATATCGTACA TCAACATATG ATTTAACTAC 9180

TGATGAAATC TCAAAAGTAA AACAAGCATT TATTAATGCA AATAGAGATG TAATTACGCT 9240

TGCCGAAGGT GATATTTCAG TTACAAATAC ACCTAATGGT GCTAATGTAA GTACTATTAC 9300

AGTAAATATT AATAAAGGTC GATTAACGAA ATCATTCGCG TCAAACCTAG CTAATATGAA 9360

TTTCTTGCGT TGGGTTAATT TCCCACAAGA TTATACAGTG ACATGGACGA ATGCAAAAAT 9420

TGCAAACAGA CCAACAGATG GTGGTTTATC ATGGTCTGAT GACCATAAAT CTTTAATTTA 9480

TCGTTATGAT GCTACATTAG GTACTCAAAT TACGACGAAT GATATTTTAA CAATGTTAAA 9540

AGCAACAACT ACAGTGCCTG GATTGCGAAA TAACATTACT GGTAATGAAA AATCACAAGC 9600

AGAAGCTGGC GGAAGACCTA ACTTTAGAAC GACTGGTTAT TCACAATCAA ATGCGACAAC 9660

TGATGGTCAA CGTCAATTTA CGTTGAATGG TCAAGTGATT CAAGTGTTAG ACATCATCAA 9720

CCCTTCAAAC GGTTATGGTG GGCAACCTGT TACAAATTCA AATACTCGTG CAAACCATAG 9780

TAACTCAACT GTTGTTAACG TAAACGAACC GGCAGCTAAT GGTGcTGGCG CATTTACAAT 9840

TGACCACGTT GTAAAAAGTA ATTCTACACA TAATGCAAGT GATGCAGTTT ATAAAGCACA 9900

GTTATACTTA ACGCCATATG GTCCAAAACA ATATGTTGAA CATTTAAATC AAAATACAGG 9960

AAATACTACT GACGCTATTA ACATTTATTT TGTACCAAGT GACTTAGTGA ATCCAACAAT 10020

TTCAGTAGGT AATTACACTA ATCATCAAGT GTTCTCAGGT GAAACATTTA CAAATACTAT 10080

TACAGCGAAT GATAACTTTG GTGTGCAATC TGTAACTGTA CCAAATACAT CACAAATTAC 10140


AGGTACTGTT GATAATAACC ATCAACATGT TTCTGCAACG GCACCAAATG TGACATCAGC 10200

GTTCAATGTA ACAGTGAAAC CTTTGCGTGA TAAATATCGA GTTGGTACTT CATCAACGGC 10320

TGCTAATCCT GTGAGAATTG CCAATATTTC GAATAATGCG ACAGTATCAC AAGCTGATCA 10380

AACGACAATT ATTAATTCGT TAACGTTTAC TGAAACAGTA CGAAATAGAA GTTATGCAAG 10440

AGCAAGTGCG AATGAAATCA CTAGTAAAAC AGTTAGTAAT GTCAGTCGTA CTGGAAATAA 10500

TGCCAATGTg CACAGTAACT GTTACTTATC AAGATGGAAC AACATCAACA GTGACTGTAC 10560

CTGTAAAGCA TGTCATTCCA GAAATCGTTG CACATTCGCA TTACACTGTA CAAGGCCAAG 10620

ACTTCCCAGC AGGTAATGGT TCTAGTGCAT CAGATTACTT TAAGTTATCT AATGGTAGTG 10680

ACATTGCAGA TGCAACTATT ACATGGGTAA GTGGACAAGC GCCAAATAAA GATAATACAC 10740

GTATTGGTGA AGATATAACT GTAACTGCAC ATATCTTAAT TGATGGCGAA ACAACGCCGA 10800

TTACGAAAAC AGCAACATAT AAAGTAGTAA GAACTGTACC GAAACATGTC TTTGAAACAG 10860

CCAGAGGTGT TTTATACCCA GGTGTTTCAG ATATGTATGA TGCGAAACAA TATGTTAAGC 10920

CAGTAAATAA TTCTTGGTCG ACAAATGCGC AACATATGAA TTTCCAATTT GTTGGAACAT 10980

ATGGTCCTAA CAAAGATGTT GTAGGCATAT CTACTCGTCT TATTAGAGTG ACATATGATA 11040

ATAGACAAAC AGAAGATTTA ACTATTTTAT CTAAAGTTAA ACGTGACCCA CCTAGAATTG 11100

ACGCAAACTC TGTGACATAT AAAGCAGGTC TTACAAACCA AGAAATTAAA GTTAATAACG 11160

TATTAAATAA CTCGTCAGTA AAATTATTTA AAGCAGATAA TACACCATTA AATGTCACAA 11220

ATATTACTCA TGGTAGCGGT TTTAGTTCGG TTGTGACAGT AAGTGACGCG TTACCAAATG 11280

GCGGAATTAA AGCAAAATCT TCAATTTCAA TGAACAATGT GACGTATACG ACGCAAGACG 11340

AACATGGTCA AGTTGTTACA GTAACAAGAA ATGAATCTGT TGATTCAAAT GACAGTGCAa 11400

CAGTAACAGT GACACCACAA TTACAAGCAA CTACTGAAGG CGCTGTATTT ATTAAAGGTG 11460

GCGACGGTTT TGATTTCGGA cacgtagaaa GATTTATTCA AAACCCGCCA CATGGGGCAA 11520

CGGTTGCATG GCATGATAGT CCAGATACAT GGAAGAATAC AGTCGGTAAC ACTCATAAAA 11580

CTGCGGTTGT AACATTACCT AATGGTCAAG GTACGCGTAA TGTTGAAGTT CCAGTCAAAG 11640

TTTATCCAGT TGCTAATGCA AAGGCGCCAT CACGTGATGT GAAAGGTCAA AATTTGACTA 11700

ATGGAACGGA TGCGATGAAC TACATTACAT TTGATCCAAA TACAAACACA AATGGTATCA 11760

CTGCAGCATG GGCAAATAGA CAACAACCAA ATAACCAACA AGCAGGCGTG CAACATTTAA 11820

ATGTCGATGT CACATATCCA GGTATTTCAG CTGCTAAACG AGTTCCTGTT ACTGTTAATG 11880

TATATCAATT TGAATTCCCT CAAACTACTT ATACGACAAC GGTTGGAGGC ACTTTAGCAA 11940

GTGGTACGCA AGCATCAGGA TATGCACATA TGCAAAATGC TACTGGTTTA CCAACAGATG 12000
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/-• aaTPTfvy*r 
uAAlu 1 LjVj\_ I 


AAAGTCGTTA 


ACGCAAAATA 


TGACGTCATC 


TATAACGGAC 


12120 


J1*^2V ( 1 — * 1 1 — 1 T 1 — 1 — ¥ 


It 21 TPTTT & 


w LAuCGAAAT 


TTGTAGTAAA 


AGATGTGCAA 


CCAGCGAAAC 


12180 


CAACTGTGAC 




pip* japip^ ta p*p"<** ta 


T^TYA /^TV KfTWTV^n 

TTACAATTGC 


ACCTGGAGCA AACCAAACAG 


12240 






P*!» TV TV JA /"^ » rr» 

G 1 AALGACAT 


ACGCTGATAA 


ATTAGTTATT 


AAACGTAATG 


12300 






JA f JA f^t^T*fr^ TV 

At- AC.GTCGCA 


TV HP IV IV IV « 

ATAATACGAG 


TCCATGGGTG 


AAAGAAGCAT 


12360 




TP^T TA P* P* TA m T 


ATTGCTGGAA 


CTAATAATGG 


TATTACTGTT 


GCAGCAGGTA 


12420 




TP" PTY"^ TATTA P* JV 

i. vjt_ I ALA 


ATTCAAGTTG 


TTGCAACGCA 


AGGAAGCGGA 


GAGACAGTGA 


12480 




ACGTAGTGAT 


G ATTT CACAG 


TTGTCGGACC 


ACAACCGAAC 


CAAGOGACTA 


12540 


/"^T^ * TV /' jv ^^^rt^p^^ 


7V TV IV > qv/^T 

GCAAAATGGT 


CATATTGATA 


TCACGCCTAA 


TAATCCATCA 


GGACATTTAA 

\JV*/\VJ\ XXX 


12600 


X lAATLLAAU 


W» fV^f*TV IV *TV* 

TCAAG CAATG 


GATATTGCTT 


ACACTGAAAA 


AGTGGGTAAT 


wvs x wwiunnv* 


12660 


A J. Au 1 AAuAL 


AATTAATG TT 


GTTCGTGGTC 


AAAATAATCA 


ATGGACAATT 


G CG AATAAG C 


12720 


CTGA CT ATG T 


IV TV ^^•|»l«fk /-» JV T» 

AACGTTAGAT 


GCACAAACTvj 


GTAAAGTGAC 


GTTCAATGCC 


AATACTATAA 


12780 




91 T^ IV IV T*/"' * * 

AT CAATCACA 


ATTACTCCGA 


AAGCAGGTAC 


AGGTCACTCA 


GTAAGTAGTA 


12840 - 


IA T^P*/ - * JV TV /■~»«T» IV 


ATTAACTGCA 


CCGGCAGCTC 


ATACTGTCAA 


CACAACTGAA 


ATTGTGAAAG 


12900 


ATTATGGTT C 


TV IV IV ^P^iiii JV TV m 

AAATG T AA CA 


GCAGCTGAAA 


TTAACAATGC 


AGTTCaAGTT 


GCTAATAAAC 


12960 


^""1*11 /'"IV TV T\ O 


GATTAAAAAT 


GGCACAGCAA 


TGCCTACTAA 


TTTAGCTGGT 


GGTAGCACAA 


13020^ 




TV **TV"^ TV P* TV / "H TV 

TGTGACAGTA 


ACTTACAATG 


ATGGTAGTAC 


TGAAGAAGTA 


CAAGAGTCCA 


13080 


X X X X V— >^L_M>\>\ 


IPPffPR T7A ft. TV 

AGIAtGA 1 AAA 


CGTGAGTTAA 


TCACAGCTAA 


AAATCATTTA 


GATGATCCAG 


13140 ; 


1 AMljV_AV_ X oA 


J\ r^CT* T\ TV TV JV TV^" 

AoO 1 AAAAAG 


C CAGGTACAA 


TTACGCAGTA 


CAATAATGCA 


ATGCATAATG 


13200*^* 


fY2f* && f* IV 21 P* & 


TV JATV^IV 7A T fA P*T» 

AA rtAAT ACT 


TV W TV TV ^TA ^ 

GCGAAAACZAG 


IV JV /*^^ a V JV S"V1l TV ^ W 

AAGCACAACA 


AGTGATTAAT 


AATGAGCGTG 


13260^ 


p*7A apftppira 


AL-AAGTTTCT 


^ JV JV /*«<T ■ » iv 

GACGCACTAA 


CTAAAGTTCG 


TGCAGCACAA 


ACTAAGATTG 


13320 




AVJWil 1 AC 11 


/^•TATAJVTAT'TVTVTV/^ 

WVAAATAAAG 


JV TV TV *T*TV TV JV 

AAGATAATAG 


CCAATTAGTA 


ACGTCTAAAA 


13380 


aTAArTTapa 

■*\ 1 X X nwi 


AAV» 1 ILlulu 


TA IV TV TV /'»«■« TV ^ 

AAvJGAAGTAC 


CATCAACTGC 


TGGTATGACG 


CAACAAAGTA 


13440 


1TP.HTZI &PT& 


TJV J\ *TY*»/"v~» ta TAP* 

A AA i\j<-GAAG 


TA TV PV*^^/^ TV TV ^ 

AAGCVjTGAAG 


y-« JV a jv jv Jv / Ml v/^ JV 

CAGAAACTGA 


AATAACTGCA 


GCTCAACGTG 


13500 


TTHTTPiPail 
1 Ini IbALAA 


'1 W*P*0 TV TV /*» TV 

xXjGt-GATGCA 


JV / ••f*^*^^ - JV ^*JV IV ^ 

ACTGCACAAC 


AAATTTCAGA 


TGAAAAACAT 


CGTGTCGATA 


13560 


ACGCATTAAC AGCATTAAAC CAAGCGAAAC ATGATTTAAC TGCAGATACA CATGCCTTAG 13620

AGCAAGCAGT GCAACAATTG AATCGCACAG GTACAACGAC TGGTAAGAAG CCGGCAAGTA 13680

TTACTGCTTA CAATAATTCG ATTCGTGCAC TTCAAAGTGA CTTAACAAGT GCTAAAAATA 13740

GCGCTAATGC TATTATTCAA AAGCCAATAA GAACAGTACA AGAAGTGCAA TCTGCGTTAA 13800
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CTGATAATAG TGCTTTAAAA ACTGCTAAGA CGAAACTTGA TGAAGAAATC AATAAATCAG 
TAACTACTGA TGGTATGACA CAATCATCAA TCCAAGCATA TGAAAATGCT AAACGTGCGG 
GTCAAACAGA ATCAACAAAT GCACAAAATG TTATTAACAA TGGTGATGCG ACTGACCAAC 
AAATTGCCGC AGAAAAAACA AAAGTAGAAG AAAAATATAA TAGCTTAAAA CAAGCAATTG 
CTGGATTAAC TCCAGACTTG GCACCATTAC AAACTGCAAA AACTCAGTTG CAAAATGATA 
TTGATCAGCC AACGAGTACG ACTGGTATGA CAAGCGCATC TATTGCAGCA TTTAATGAAA 
AACTTTCAGC AGCTAGAACT AAAATTCAAG AAATTGATCG TGTATTAGCC TCACATCCAG 
ATGTTGCGAC AATACGTCAA AACGTGACAG CAGCGAATGC CGCTAAATCA GCACTTGATC 
AAGCACGTAA TGGCTTAACA GTCGATAAAG CGCCTTTAGA AAATGCGAAA AATCAACTAC 
AACATAGTAT TGACACGCAA ACAAGTACAA CTGGTATGAC ACAAGACTCT ATAAATGCAT 
ACAATGCGAA GTTAACAGCT GCACGTAATA AGATTCAACA AATCAATCAA GTATTAGCAG 
GTTCACCGAC TGTAGAACAA ATTAATACAA ATACGTCTAC AGCAAATCAA GCTAAATCTG 
ATTTAGATCA TGCACGTCAA GCTTTAACAC CAGATAAAGC GCCGCTTCAA ACTGCGAAAA
CGCAATTAGA ACAAAGCATT AATCAACCAA CGGATACAAC AGGTATGACG ACCGCTTCGT 
TAAATGCGTA CAACCAAAAA TTACAAGCAG CGCGTCAAAA GTTAACTGAA ATTAATCAAG 
TGTTGAATGG CAACCCAACT GTCCAAAATA TCAATGATAA AGTGACAGAG GCAAACCAAG 
CTAAGGATCA ATTAAATACA GCACGTCAAG GTTTAACATT AGATAGACAG CCAGCGTTAA 
CAACATTACA TGGTGCATCT AACTTAAACC AAGCACAACA AAATAATTTC ACGCAACAAA
TTAATGCTGC TCAAAATcAT GCtGCGCTTG AAACAATTAA GTCTAACATT ACGGCTTTAA 
ATACTGCGAT GACGAAATTA AAAGACAGTG TTGCGGATAA TAATACAATT AAATCAGATC 
AAAATTACAC TGACGCAACA CCAGCTAATA AACAAGCGTA TGATAATGCA GTTAATGCGG 
CTAAAGGTGT CATTGGAGAA ACGACTAATC CAACGATGGA TGTTAACACA GTGAACCAAA 
AAGCAGCATC TGTTAAATCG ACGAAAGATG CTTTAGATGG TCAACAAAAC TTACAACGTG 
CGAAAACAGA AGCAACAAAT GCGATTACGC ATGCAAGTGA TTTAAACCAA GCACAAAAGA
ATGCATTAAC ACAACAAGTG AATAGTGcAC AAAACGTGCA AGCAGTAAAT GATATTAAAC 
AAACGACTCA AAGCTTAAAT ACTGCTATGA CAGGTTTAAA ACGTGGCGTT GCTAATCATA 
ACCAAGTCGT ACAAAGTGAT AATTATGTCA ACGCAGATAC TAATAAGAAA AATGATTACA 
ACAATGCATA CAACCATGCG AATGACATTA TTAATGGTAA TGCACAACAT CCAGTTATAA 
CACCAAGTGA TGTTAACAAT GCTTTATCAA ATGTCACAAG TAAAGAACAT GCATTGAATG
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ATTTAAATAA TGCACAACGT CAAAACTTAC AATCGCAAAT TAATGGTGCG CATCAAATTG 15720 

ATGCAGTTAA TACAATTAAG CAAAATGCAA CAAACTTGAA TAGTGCAATG GGTAACTTAA 15780 

GACAAGCTGT TGCAGATAAA GATCAAGTGA AACGTACAGA AGATTATGCG GATGCAGATA 15840

CAGCTAAACA AAATGCATAT AACAGTGCAG TTTCAAGTGC CGAAACAATC ATTAATCAAA 15900

CAACAAATCC AACGATGTCT GTTGATGATG TTAATCGTGC AACTTCAGCT GTTACTTCTA 15960

ATAAAAATGC ATTAAATGGT TATGAAAAAT TAGCACAATC TAAAACAGAT GCTGCAAGAG 16020

CAATTGATGC ATTACCACAT TTAAATAATG CACAAAAAGC AGATGTTAAA TCTAAAATTA 16080

ATGCTGCATC AAATATTGCT GGCGTAAATA CTGTTAAACA ACAAGGTACA GATTTAAATA 16140

CAkCGATGGg TAACTTGCAA GGTGCAATCA ATGATGAACA AACGACGCTT AATAGTCAAA 16200 

ACTATCAAGA TGCGACACCT AGTAAGAAAA CAGCATACAC AAATGCGGTA CAAGCTGCGA 16260 

AAGATATTTT AAATAAATCA AATGGTCAAA ATAAAACGAA AGATCAAGTT ACTGAAGCGA 16320

TGAATCAAGT GAATTCTGCT AAAAATAACT TAGATGGTAC GCGTTTATTA GAf CAAGCGA 16380 

nCAAaCAGCA AAACAGCAGT TAAATAATAT GACGCATTTA ACAACTGCAC AAAAAACGAA 16440 

TTTAACAAAC CAAATTAATA GTGGTACTAC TGTCGCTGGT GTTCAAACGG TTCAATCAAA 16500

TGCCAATACA TTAGATCAAG CCATGAATAC GTTAAGACAA AGTATTGCCA ACAAAGATGC 16560 

GACTAAAGCA AGTGAAGATT ACGTAGATGC TAATAATGAT AAGCAAACAG CATATAACAA 16620 

CGCAGTAGCT GCTGCTGAAA CGATTATTAA TGCTAATAGT AATCCAGAAA TGAATCCAAG 16680

TACGATTACA CAAAAAGCAG AGCAAGTGAA TAGTTCTAAA ACGGCACTTA ACGGTGATGA 16740

AAACTTAGCT GCTGCAAAAC AAAATGCGAA AACGTACTTA AACACATTGA CAAGTATTAC 16800

AGATGCTCAA AAGAACAATT TGATTAGTCA AATTACTAGT GCGACAAGAG TGAGTGGTGT 16860 

TGATACTGTA AAACAAAATG CGCAACATCT AGACCAAGCT ATGGCTAGCT TACAGAATGG 16920 

TATTAACAAC GAATCTCAAG TGAAATCATC TGAGAAATAT CGTGATGCTG ATACAAATAA 16980 

ACAACAAGAG TATGATAATG CTATTACTGC AGCGAAAGCG ATTTTAAATA AATCGACAGG 17040 

TCCAAACACT GCGCAAAATG CAGTTGAAGC AGCATTACAA CGTGTTAATA ATGCGAAAGA 17100 

TGCATTGAAT GGTGATGCAA AATTAATTGC AGCTCAAAAC GCAGCGAAAC AACATTTAGG 17160

TACTTTAACG CATATCACTA CAGCTCAACG TAATGATTTA ACAAATCAAA TTTCACAAGC 17220

TACAAACTTA GCTGGTGTTG AATCTGTTAA ACAAAATGCG AATAGTTTAG ATGGTGCTAT 17280

GGGTAACTTA CAAACGGCTA TCAACGATAA GTCAGGAACA TTAGCGAGCC AAAACTTCTT 17340

GGATGCTGAT GAGCAAAAAC GTAATGCATA CAATCAAGCT GTATCAGCAG CCGAAACCAT 17400

TGTTAATAAT GCGAAACATG CATTAAATGG TACGCAAAAC TTAAACAATG CGAAACAAGC 17520

AGCGATTACA GCAATCAATG GCGCATCTGA TTTAAATCAA AAACAAAAAG ATGCATTAAA 17580

AGCACAAGCT AATGGTGCTC AACGCGTATC TAATGCACAA GATGTACAGC ACAATGCGAC 17640

TGAACTGAAC ACGGCAATGG GCACATTAAA ACATGCCATC GCAGATAAGA CGAATACGTT 17700

AGCAAGCAGT AAATATGTTA ATGCCGATAG CACTAAACAA AATGCTTACA CAACTAAAGT 17760

TACCAATGCT GAACATATTA TTAGCGGTAC GCCAACGGTT GTTACGACAC CTTCAGAAGT 17820

AACAGCTGCA GCTAATCAAG TAAACAGCGC GAAACAAGAA TTAAATGGTG ACGAAAGATT 17880

ACGTGAAGCA AAACAAAACG CCAATACTGC TATTGATGCA TTAACACAAT TAAATACACC 17940

TCAAAAAGCT AAATTAAAAG AACAAGTGGG ACAAGCCAAT AGATTAGAAG ACGTACAAAC 18000

TGTTCAAACA AATGGACAAG CATTGAACAA TGCAATGAAA GGCTTAAGAG ATAGTATTGC 18060

TAACGAAACA ACAGTCAAAA CAAGTCAAAA CTATACAGAC GCAAGTCCGA ATAACCAATC 18120

AACATATAAT AGCGCTGTGT CAAATGCGAA AGGTATCATT AATCAAACTA ACAATCCGAC 18180

TATGGATACT AGTGCGATTA CCCAAGCTAC AACACAAGTG AATAATGCTA AAAATGGTTT 18240

AAACGGTGCT GAAAACTTAA GAAATGCACA AAACACTGCT AAGCAAAACT TAAATACATT 18300

ATCACACTTA ACAAATAACC AAAAATCTGC CATCTCATCA CAAATTGATC GTGCAGGTCA 18360

TGTGAGTGAG GTAACTGCTA CTAAAAATGC AGCAACTGAG TTGAATACGC AAATGGGTAA 18420

CTTGGAACAA GCTATCCATG ATCAAAACAC AGTTAAACAA AGTGTTAAAT TTACTGATGC 18480

AGATAAAGCT AAACGTGATG CGTATACAAA TGCGGTAAGC AGAGCTGAAG CAATTCTGAA 18540

TAAAACGCAA GGTGCAAATA CGTCTAAACA AGATGTTGAA GCGGCTATTC AAAATGTTTC 18600

AAGTGCTAAA AATGCATTGA ATGGTGATCA AAACGTTACA AATGCGAAGA ATGCAGCTAA 18660

AAATGCATTA AATAACTTAA CGTCAATTAA TAATGCACAA AAACGTGACT TAACAACTAA 18720

AATTGATCAA GCAACAACTG TAGCTGGTGT TGAAGCTGTA TCTAATACGA GTACACAATT 18780

GAAtACAGCG ATGGCTAACT TGCAAAATGG TATTAATGAT AAAACAAATA CACTAGCAAG 18840

TGAAAACTAT CATGATGCTG ATTCAGATAA GAAAACTGCT TATACTCAAG CCGTTACGAA 18900

CGCAGAAAAT ATTTTAAATA AAAATAGTGG ATCAAATTTA GACAAAACTG CCGTTGAAAA 18960

CGCGTTGTCA CAAGTTGCTA ATGCGAAAGG TGCCCTAAAT GGTAACCATA ATTTAGAGCA 19020

AGCTAAATCA AATGCAAACA CTACTATAAA CGGACTTCAA CATTTAACAA CTGCTCAAAA 19080

AGATAAATTG AAACAACAAG TGCAACAAGC ACAAAATGTT GCAGGTGTAG ATACTGTTAA 19140




ATCAAGTGCC 


AACACATTAA 


ATGGTGCTAT 


' GGGTACGTTA 


> AGAAATAGCA 


k TACAAGATAA 


19200 
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TAACAATGCT 
TGCTAATGCA 
TACACATAAT 
CTTAAATAAA 
AAATGTAACA 
ACATGGTATT 
AAGTAAGAAA 
AACAGGTTCA 
GAAAGATGCA 
CTTAGGCACT 
TCAAGCGACG 
CGCAATGAAT 
TTATCTTGAT 
AGGCATTTTA 
AAATGCAGTT 
AACTTCAGCA 
CTTGAAGCAT 
AGGTAATACG 
GACGAAAACA 
TGCTGTAAAT 
TGCGATTAAT 
AAACTTAGCT 
CCAAAAACAA 
AAATAACGTT 
TATTGCTGAT 
ACGTCAAGCG 
TGCGACATTA 
AG CATTAAAT 
CGGCTTAGCA 



GTTGATAGTG 
ATTAACCAAA 
TTAACGCAAG 
GCGCAAAAAG 
AGTATCCAAC 
GATGATGAAA 
ACTGCTTATG 
AATTCAGATA 
TTGAATGGTG 
TTAAACCATA 
ACTGTTGATG 
AGCTTACAAG 
GCGGATGAAT 
AATAAACAAA 
ACAAGAGCGA 
ACAAATACGA 
CAAGTTGAaC 
TTAAATACTG 
AGTCAAAATT 
AATGCAAATG 
GGCATGGCAA 
CAAGCTAAAA 
AAAGATGCAT 
CAACACACTG 
AAAGAAAGAA 
TATGATTCAA 
ACAGTCAATG 
GGTGATAACA 
CAATTGAATA 



CTAATGGTGT 
TCGCTACACA 
CGAAACAAAC 
ATGCGTTAAA 
AAACTGCAAA 
ATGCAACAAA 
ATCAAGCTGT 
AAGCAGCAGT 
ATGCAAAACT 
TTACGAATGC 
GCGTTAATAC 
GTTCAATCAA 
CAAAACGAAA 
CTGGTGGTAA 
AAGcGgCTTT 
TTGATGGTTT 
AAGCGCAAAA 
CCATGGGTGC 
ATCTTGATGC 
GTGTTATTAA 
ATCAAGTCAA 
CAAATGCGAC 
TAAAAACACA 
CAACTGAATT 
CAAAAGCAAG 
AAGTGACTAA 
ACGTAAATAG 
ACTTACGTGT 
ATGCACAAAA 



CATTAATGCA 
AGTGACATCA 
AGCAACAAAT 
AGCACAAGTT 
TGAACTTAAT 
ACAAACTCAA 
AG CTG CTGCG 
TGACCGTGCA 
GGCAGAAGCG 
ACAACGTACT 
TGTAAAAACA 
TGATAAAGAT 
TGCATATACG 
CACATCTAAA 
AAATGGTGCT 
AC CTAACTTA 
TGTAGCAGGT 
ATTACGTACA 
AT CTGACAGC 
TGCAACGAAC 
TACAACAAAA 
GAACACAATT 
AGTTAACAAT 
GAACAGTGCG 
CGGTAATTAT 
CGCTGAAAAT 
TGCGGCATCA 
AGCGAAAGAG 
AGCAAAATTA 



ACAAGCAATC 

ACGAAAAATG 

GCCATCGATG 

ACAAGTGCGC 

ACAGCTATGG 

AAATATCGTG 

AAAGCAATTT 

TTACAACAAG 

AAAGCGGCAG 

GACTTAGAAG 

AATGCCAATA 

GCGACATTAA 

CAAGCTGTCA 

GCAGACGTTG 

GACAACTTAA 

ACACAATTAC 

GTAAATGGTG 

AGTATCCAAA 

AACAAAAATA 

AATCCAAATA 

GCAGCGTTAA 

AACAACGCAC 

GCACAACGTG . 

ATGACAGCAC 

GTCAATGCTG 

ATCATTAGTG 

CAAGTCAATG 

CATGCCAACA 

AAAGAACAAG 



CAAATATGGA 
CATTAGATGG 
GTGCTACTAA 
AACGTGTTGC 
GTCAATTACA 
ACGcTGAACA 
TAAATAAACA 
TAACAAGTAC 
CTAAACAAAA 
GCCAAATCAA 
CATTAGACGG 
GAAATCAAAA 
CAGCGGCTGA 
AT AATG CATT 
GAAATGCGAA 
AAAAAGACAA 
TTAAAGATAA 
ATGATAATAC 
ATTACAATAC 
TGGATGCTAA 
ATGGTGCACA 
ATGACTTAAA 
TATcTGATGC 
TTAAAGCAGC 
ATCAAGAAAA 
GTACACCGAA 
CGGCTAAAAC 
ATACAATTGA 
TTCAAAGTGC 



19320 

19380 

19440 

19500 

19560 

19620 

196B0 

19740 

19800 

19860 

19920 

19980 

20040 

20100 

20160 

20220 

20280 

20340 - 

20400 : 

20460- 

20520 

20580 

20640 

20700 

20760 

20820 

20880 

20940 

21000 
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GAAAGGCTTA 


AGAGATAGTA 


TTGCGAATGA 


AGCAACAATT 


AAAGCAGGTC 


AAAACTACAC 


21120 




TGACGCAAGT 


CCAAATAATC 


GTAACGAGTA 


CGACAGTGCA 


GTTACTGCAG 


CAAAAGCAAT 


21180 


5 


CATTAATCAA 


ACATCGAACC 


CAACGATGGA 


ACCAAATACT 


ATTACGCAAG 


TAACATCACA 


21240 




AGTGACAACT 


AAAGAACAGG 


CATTAAATGG 


TGCGCGAAAC 


TTAGCTCAAG 


CTAAGACAAC 


21300 




TGCGAAAAAC 


AACTTGAATA 


ACTTAACATC 


AATTAACAAT 


GCACAAAAAG 


ATGCGTTAAC 


21360 


10 


GCGTAgcATT GATGGTGCAA CAACAGTAGC 


TGGTGTAAAT 


CAAGAAACTG 


CAAAAGCAAC 


21420 




AGAATTAAAT 


AACG CAATGC 


ATAGTTTACA 


AAATGGTATC 


AATGATGAGA 


CACAAACAAA 


21480 




ACAAACTCAG 


AAATACCTAG 


ATGCAGAGCC 


AAGTAAGAAA 


TCAGCTTATG 


ATCAAGCAGT 


21540 


AAATGCAGCG 


AAAGCAATTT 


TAACAAAAGC 


TAGTGGTCAA 


AATGTAGACA 


AAGCAGCAGT 


21600 




TGAACAAGCA 


TTGCAAAATG 


TGAACAGTAC 


GAAGACGGCG 


TTGAACGGTG 


ATGCGAAATT 


21660 




AAATGAAGCT 


AAAGCAGCTG 


CGAAACAAAC 


GTTAGGTACA 


TTAACACACA 


TTAATAATGC 


21720 




ACAACGTACA 


GCGTTAGACA 


ATGAAATTAC 


ACAAGCAACA 


AATGTTGAAG 


GTGTTAATAC 


21780 




AGTTAAAGCC 


AAAGCGCAAC 


AATTAGATGG 


TGCTATGGGT 


CAATTAGAAA 


CATCAATTCG 


21840 




TGATAAAGAC 


ACGACGTTAC 


AAAGTCAAAA 


TTATCAAGAT 


GCTGATGATG 


CTAAACGAAC 


21900 




TGCTTATTCT 


CAAGCAGTAA 


ATGCAGCAGC 


AACTATTTTA 


AATAAAACAg 


CTGGCGGTAA 


21960 




TACACCTAAA 


GCAGATGTTG 


AAAGAGCAAT 


GCAAGCTGTT 


ACACAAGCAA 


ATACTGCATT


22020 




AAACGGTATT 


CAmAACTTAG 


ATCGTGCGAA 


ACArGCTGCT 


AACACAGCGA 


TTACAAATGC 


22080 




TTCGGACTTA 


AATACAAAAC 


mAAAAGAAGC 


ATTAAAAgCA 


CAAGTAACAA 


GTGCAGGACG 


22140 




TGTATCTGCA 


GCAAATGGTG 


TTGAACATAC 


TGCGACTGAA 


TTAAATACTG 


CGATGACAGC 


22200 




TTTAAAGCGT 


GCCATTGCTG 


ATAAAGCTGA 


GACAAAAGCT 


AGTGGTAACT 


ATGTCAATGC 


22260 




TGATCJCGAAT 


AAACGTCAAG 


CATATGATGA 


AAAAGTTACA 


GCTGCCGAAA 


ATATCGTTAG 


22320 




TGGTACACCA 


ACACCAACGT 


TAACACCAGC 


AGATGTTACA 


AATGCAGCAA 


CGCAAGTAAC 


22380 


GAATGCTAAG 


ACGCAGTTAA 


ACGGTAATCA 


TAATTTAGAA 


GTAGCGAAAC 


AAAATGCTAA 


22440 




CACTGCAATT 


GATGGTTTAA 


CTTCTTTAAA 


TGGTCCGCAA 


AAAGCAAAAC 


TTAAAGAACA 


22500 




AGTGGGTCAA 


GCGACGACGT 


TGCCAAATGT 


TCAAACTGTT 


CGTGATAATG 


CACAAACATT 


22560 




AAACACTGCA 


ATGAAAGGTC 


TACGAGATAG 


CATTGCGAAT 


GAAGCAACGA 


TTAAAGCAGG 


22620 




TCAAAACTAC 


ACAGATGCAA 


GTGAAAACAA 


ACAAACTGAC 


TACAACAGTG 


CAGTCACTGC 


22680 




AGCAAAAGCA 


ATCATTGGTC 


AAACAACTAG 


TCCATCAATG 


AATGCGCAAG 


AAATTAATCA 


22740 




AGCGAAAGAC 


CAAGTGACAG 


CTAAACAACA 


AGCGTTAAAC 


GGTCAAGAAA 


ACTTAAGAAC 


22800 








AGATGCAGTG


AAACGTCAAA TCGAAGGTGC


AACGCATGTT


AATGAAGTAA




22920 




AAATAATGCG 


GATGCaTTAA


ATACAGCTAT


GACGAACTTG 


AAAAATGGTA 


TTCAAGATPA


22980




GAATACGATT 


AAGCAAGGTG 


TTAACTTCAC 


TGATGCCGAC 


GAAGCGAAAC 


GTAATGCATA


23040




TACAAATGCA 


GTGACGCAAG 


CTGAACAAAT 


TTTAAATAAA 


GCACAAGGTC 


CAAATACTTP 


23100




AAAAGACGGT 


GTCGAAACTG 


CGTTAGAaAA 


TGTACAACGT 


GCTAAAAACG 


AATTGAAPrTf; 


23160


TAATCAAAAT 


GTTGCGAACG 


CTAAGACAAC 


TGCGAAAAAT 


GCATTGAATA 


ACCTAACATf* 






AATTAATAAT 


GCACAAAAAG 


AAGCATTGAA 


ATCACAAATT 


GAAGGTGCGA 


CAAnvnTTfrr 

WVW.AVj A A 






AGGTGTAAAT 


CAAGTGTCTA 


CAACGGCATC 


TGAATTAAAT 


ACAGCAATGA 


G CAACTT A PA 

VJV^lAWi X/\l*n 


23340




AAATGGTATT 


AATGATGAAG 


CAGCTACAAA 


AGCAGCGCTT 


AATGGTACTf? 


' AAA A n^TTC A 


23400




AAAAGCTAAA 


CAACACGCAA 


ATACAGCAAT 


TGACGGTTTA 


AGCCATTTAA








AAAAGAGGCA 


TTAAAACAAT 


TGGTACAACA 


ATCGACTACT 


GTTGCAGAAG 




23520




TGAGCAAAAA 


GCAAACAATG 


TTGATGCAGG 


AATGGACAAA 


TTACGTCAAA 

a a nvvj A wvin 




23580




TAATGCGACA 


ACAAAACAAA 


ACCAAAATTA 


TACTGATGCA 


AGTCAGAATA 


A A A AIV2 A 


23640




GTACAATAAT 


GCTGTCACAA 


CTGCACAAGG 


TATTATTGAT 


CAAACTACAA 


RTPf* A 21 f^TTT 

O X LLiWil.. 1X1 


23700




AGATCCGACT 


GTTATCAATC 


AAGCTGCTGG 


ACAAGTAAGC 


ACAACTAAAA 


ATfiTATT^ A A 
InnA 


23760 




TGGTAATGAA 


AACCTAGAGG 


CAGCGAAACA 


ACAAGCGTCA 


CAATCATTAG 


Vj 1 1 \— r\ 1 1 


23820




TAACTTAAAT 


AATGCGCAAA 


AACAAACAGT 


TACTGATCAA 


ATTAATGGCG 


CGCATAPTGT 


23880




TGATGAAGCA 


AATCAAATTA 


AGCAAAATGC 


GCAAAACTTA 


AATACAGCGA 


TGGGTAA ("T" 1 1 

A V7V7VJ X .AAVw X X 


23940




GAAACAAGCG 


ATAGcTGACA 


AAGATGCTAC 


GAAAGCGACA 


GTTAACTTCA 


CTGATGCAGA 


24000


TCAAGCAAAA 


CAACAAGCAT 


ATAACaCTGC


TGTTACAAAT 


GCTGAAAATA 


TCATTTt^LAAA 

A A A A 






AGCTAATGGC 


GGCAATGCAA 


CACAAGCTGA 


AGTTGAACAA 


GCAATCAAAC 


AAGTTAATnr' 


24120




TGCAAAACAA 


GCATTAAATG 


GTAATGCCAA 


CGTTCAACAT 


GCAAAAGACG 


AAGCAACAGC 


24180




ATTAATTAAT 


AGCTCTAATG ACCTTAACCA AGCACAAAAA GACGCATTAA 


AACAACAAGT 


24240 




TCAAAATGCA 


ACTACTGTAG 


CTGGTGTAAA 


CAATGTTAAA 


CAAACAGCAC 


AAGAGTTAAA 


24300




CAATGCTATG 


ACACAATTAA 


AACAAGGCAT 


TGCAGATAAA 


GAACAAACAA 


AAGCTGATGG 






A ™»W AAA w A W 


AATGCAGATC 


CTGATAAGCA 


AAATGCATAT


AATCAAGCAG 


TAGCGAAAGC 


24420 




TGAAGCATTA 


ATTAGTGctA 


CGCCTGATGT 


TGTCGTTACA 


CCTAGCGAAA 


TTACTGCAGC 


24480 




GTTAAATAAA 


GTTACGCAAG 


CTAAAAATGA 


TTTAAATGGT 


AATACAAACT 


TAGCAACGGC 


24540 




GAAACAAAAT 


GTTCAACATG 


CTATTGATCA 


ATTGCCAAAC 


TTAAACCAAG 


CGCAACGTGA 


24600 
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AGCGGCGACA 
ACAAATTAAA 
TAATGCAGTA 
AAATACAATT 
TCAAAAATTA 
AAATGATGCT 
AGTTAATAAT 
TGCATTACAA 
TAAGAAAGAT 
TGGCAGCAAT 
ACAAGCATTG 
TACAAATGCA 
AGACCAAACT 
CAACAATGCG 
TCCACAACAA 
TAACCATAAC 
CTTAAATCAA 
AGGTGTTAAT 
ACAACAAATT 
AGACAAACAA 
ACCAACACCT 
GAAAGATGCA 
TCTTGATACG 
TCAAGCACAA 
aGCaATGAGT 
CTATCATGAT 
AGGTATTATC 
AACTCAAGTG 
GCAAAATGCT 



ACGCTTAATG 
GGTAGCGAGA 
ACAAAAGCAG 
CAACAAGCAT 
GCTGATGCCA 
CAAAAACAAG 
GTTAAGCAAA 
GATAAAACTG 
GCTTATACTA 
GCAAGTCAAA 
AATGGTAATG 
AATGATTTAA 
AAAGCAAATG 
GTAGCACATG 
GTGGCTCAAG 
TTACAAGTTG 
CCACAAAAAA 
GCTATTAAGC 
CAAGCGAACA 
CAAGCATATA 
GTATTGACGC 
TTAAACGGTG 
TTACGCGATT 
GCGTTAGCTA 
AACTTGAAAC 
GCTGATGCCG
AATCAAACGA 
ACTGATGCTA 
AAAGATGCCG 



ACGCGATGAC 
ACTATCACGA 
AAGAATTGTT 
TAACTAAAGT 
AACAAGATGC 
CGCTAACAAC 
ATGCTCAAAA 
AGACATTAAA 
ATGCGGTTTC 
CTGAAGTGGA 
ACAATGTACA 
ATCAAGCAAT
GTAACTTTGT 
CTGAACAAAT 
CGTTACAACA 
CTAAAGACAA 
CAGCATTAAA 
AAAATGCTGA 
GTCAAGTACC 
ACAATGCGGC 
CTGATACAGT 
ATGAAAAATT 
TAAATCAACC 
CAGTTGAACA 
aAGGTATTGC 
ATAAGCAAAC 
CAAATCCAAC 
AAAATGGCTT 
TAAGTGGGAT 



ACAATTGAAA 
TGCTGATACT 
AAAACAAACA 
GAATGACACA 
TAAGACAACA 
TCAAGTTGAA 
TCTGAATAAT 
TAGCATTAAC 
ACATGCAGAA 
ACAAGCGATG 
ACGTGCAAAA 
GACACAATTG 
CAATGCTGAT 
AATTAGTGGT 
AGTGAATCaA 
TGCAAATACA 
AGACCAAGTG 
TGCGTTAAAT 
ACAGTCAGTT 
TAACCAAGCG 
AACACAAGCA 
AGCACAAGCG 
ACAACGTGAT 
AACTAAACAA 
aAACAAAGAT 
AGCATATACA 
GCTTAACCCA 
AAACGGTGAA 
GACGCATTTA 



CAAGGTATTG 
GACAAGCAAA 
ACAAATCCAA 
AATCAAGCAC 
CTTGGTACAC 
CAAGCACCAG 
GCTATGACTA 
TTTACTGATG 
GGTATTTTAT 
CAACGTGTGA 
GATGCAGCGA 
AAACAAGGTA 
ACTGATAAGC 
ACACCAAATG 
GCTAAGGGTG 
GCCATTGATC 
TCGCATGCAG 
AATGcAATGG 
GACTTTACAC 
CAACAAATCG 
GTGACAACTA 
AAACAAGAAG 
GCATTACGTA 
AATGCACAAA 
ACTGTCAAAG 
AATGCAGTGT 
GATGAAATAA 
GCTAAATTGG 
AACGATGCTC 



CGAATAAAGC 
CAGCATATGA
CAATGGATCC 
TTAACGGTAA 
TAGATCATTT 
ATATTGCAAC 
ACTTAAACAA 
CAGATCAAGC 
CTAAAGCAAA 
ACGAAGCGAA 
AACAAGTGAT 
TTGCAGATAA 
AAAATGCTTA 
CAAACGTGGA 
ATTTAAACGG 
AGTTACCAAA 
AACTTGTTAC 
GTACATTGAA 
AAGCGGATCA 
CAAATGGCAT 
TGAATCAAGC 
CTTTAGCAAA 
ACCAAATCAA 
ATGTGAATAC 
CAAGTGAGAA 
CTCAAGCGGA 
CACGTGCATT 
CAACTGAAAA 
AAAAACAAGC 



24720 
24780 
24840 
24900 
24960 
25020 
25080 
25140 
25200 
25260 
25320 
25380 
25440 
25500 
25560 
25620 
25680 
25740 
25800 
25860 
25920 
25980 
26040 
26100 
26160 
26220 
26280 
26340 
26400 
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AGCAACGAGC 

AACATTAGCG 

GGCAGTAGCA 

ACAAGTTGAA 

CAATTTGGCA 

TGATGCACAA 

CACTACGATT 

TGTTGCGGAT 

ACAAAATGAC 

GCCTACGATG 

TGCATTAGAT 

TCAATTAGAT 

ATCTGATATT 

GGGTAACTTA 

CAATGCTGAT 

GATTAACAAA 

AGTTCAAACA 

TGCGACGCAA 

AGACCAAGTT 

TACGCTTAAC 

AGCAAATAGC 

TCAAGCCGCA 

TCAAGCAGCG 

AAATGATAAA 

AAAACATATG 

TTTGACTGAA 

CAAAGATGCA 

CTATGATGAA 

CAATAAAGGT 



CTAGATCAAG 
GACGGTAATT 
AAAGCTGAAG 
AGCATCACTA 
AATGCAAAAC 
AAACAATCAT 
AATCAAAAAG 
AATCAAACGA 
TATAACCAAG 
AATCCAGATG 
GGTGATGAAA 
CATTTGAATA 
GCTGCAGTTA
ATTAATGCGA 
ACTGATAAAC 
CAAACTGGTC 
ACACTTCAAG 
GCAATTGATG 
ACAGCTGCAA 
CAAGCAATGC 
AAATATATCA 
AATAATATTA 
ACAACTGTGA 
GATCATGCTA 
GAAGATACGT 
GCACAAGCAT 
ACACGTGCGA 
GCAGTTCAAA 
AATGTATCAA 



CAATGGATCA ATTATCACAA 
ACTTAAATGC AGATCCTGAC 
CATTATTGAA TAAACAAAGT 
ATGAAGTGAA CGCAGCGAAA 
AACAAGCAAA ACAACAATTG 
TTGAAAGTCA AATTACACAA 
CACAAACGTT AGATCATGCG 
CATTAGCGTC TGAAGATTAT 
CTGTAACAGC TGCTAATAAT 
ATGTTAATGG TGCAACGACA 
ACCTTGCAGC AGCTAAACAA 
ATGCGCAAAA GCAACAGTTA 
ATGGTCACAA ACAAACAGCA 
TTGCAGATCA TCAAGCCGTT 
AAACTGCTTA TAATACAGCG 
AAAATGCGAA CCAAACAGAA 
CGTTAAATGG AGACCATAAT 
CTTTAACAAG CTTAAATGAT 
CTTTAGTAAC TGCAGTTCAT
ATGGTTTAAG ACAGAGCATT 
ACGAAGATCA ACCAGAGCAA 
TCAATGAACA AACTGCAACA 
ATACAACGAA AGCAGCATTA 
AGCAAACGGT TAGTCAATTA 
TAATTGATAG TGAAACAACT 
TAGATCAACT TATGGATGCA 
GCAGTGCATA TGTCAATGCA 
ATGCTGAGTC TATCATTGCA 
GTGCGACTCA AGCAGTAATA 



GCTATTAATG 
AAACAAAATG 
GGTACTAATG 
CAAGCATTAA 
GCGAACTTAA 
GCGCCACTTG 
ATGGAATTAT 
CATGATGCAA 
ATAATTAATC 
CAAGTGAATA 
CAAGCAAACA 
CAATCACAAA 
GAATCTTTAA 
GAACAACGTG 
GTAAATGAAG 
GTAGAACAAG 
TTACAAGTTG 
CCTCAAAAAA 
CAAATTGAAC 
CAAGATAACG 
CAAAACTATG 
TTAGATAATA 
CATGGTGATC 
GCACATCTAA 
AGAACAGCAG 
TTACAACAAA 
GAACCGAATA 
GGATTAAATA 
TCATCTAAAA 



ATAAAGCTCA 
CGTATAAACA 
AAGTACAAGC 
ATGGTAATGA 
CACACTTAAA 
TTACAGATGT 
TAAGAAATAG 
CTGCGCAAAG 
AAACTACATC 
ATACGAAAGT 
ACAGACTTGA 
TTACGCAATC 
ATACTGCGAT 
GTAACTTCAT 
CAGCAGCAAT 
CTATTACTAA 
CTAAAACAAA 
CAGCATTAAA 
AAAATGCGAA 
CAGCAACTAA 
ATCAAGCTGT 
ATGCGATTAA 
TGAAGTTACA 
ACAATGCACA 
TTAAGCAAGA 
GTATTGCTGA 
AAAAACAATC 
ATCCAACTAT 
ATGCATTAGA



26520 
26580 
26640 
26700 
26760 
26820 
26880 
26940 
27000 
27060 
27120 
27180 
27240 
27300
27360
27420
27480
27540
27600
27660
27720
27780
27840
27900 
27960 
28020 
28080 
28140 
28200 
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TCAATTAACA CCAGCTCAAC AACAAGCGCT AGAAAATCAA ATTAATAATG CAACAACTCG 2 8320 

TGATAAAGTG GCTGAAATCA TTGCACAAGC GCAAgCATtA AATGAAGCGA TGAAAGCATT 28380 

AAAAGAAAGT ATTAAGGATC AACCACAAAC TGAAGCAAGT AGTAAATTTA TTAACGAGGA 28440 

TCAAGCGCAA AAAGATGCTT ATACGCAAGC AGTACAACAC GCGAAAGATT TGATTAACAA 28500 

AACAACTGAT CCTACATTAG CTAAATCAAT CATTGATCAA GCGACACAGG CAGTGACAGA 28560 

TGCTAAAAAC AATTTACATG GTGATCAAAA ACTAGCTCAA GATAAGCAAC GTGCAACAGA 28620

AACGTTAAAT AACTTGTCTA ACTTGAATAC ACCACAACGT CAAGCACTTG AAAATCAAAT 28680 

TAATAATGCA GCAACTCGTG GCGAAGTAGC ACAAAAATTA ACTGAAGCAC AAGCACTTAA 28740 

CCAAGCAATG GAAGCTTTAC GTAATAGCAT TCAAGATCAA CAGCAAACGG AAGCGGGTAG 28800

CAAGTTTATC AATGAAGATA AaCCaCmAAA AGrTGCTTAC CAAGCAGCAG TTCAAAATGC 28860 

AAAAGATTTA ATTAATCAAA CTAACAATCC AACGCTTGAT AAAGCACAAG TTGAACAATT 28920

GACACAAGCT GTTAACCAAG CTAAAGATAA CCTACACGGT GATCAAAAAC TTGCAGACGA 28980 

TAAACAACAT GCGGTTACTG ATTTAAATCA ATTAAATGGT TTGAATAATC CGCAACGTCA 29040 

AGCACTTGAA AGCCAAATAA ACAACGCAGC AACTCGTGGC GAAGTAGCAC AAAAATTAGC 29100 

TGAAGCAAAA GCGCTTGATC AAGCAATGCA AGCATTACGT AATAGTATTC AAGATCAACA 29160 

ACAAACAGAA TCTGGTAGCA AGTTTATCAA TGAAGATAAA CCGCAAAAAG ATGCTTACCA 29220

AGCAGCAGTT CAAAATGCAA AAGATTTAAT TAACCAAACA GGTAATCCAA CACTCGACAA 29280 

ATCACAAGTA GAACAATTGA CACAAGCAGT AACAACTGCA AAAGATAATC TACATGGTGA 29340

TCAAAAACTT GCTCGTGATC AACAACAAGC AGTAACAACT GTAAATGCAT TGCCAAACTT 29400 

AAATCATGCA CAACAACAAG CATTAACTGA TGCTATAAAT GCAGCGCCTA CAAGAACAGA 29460 

GGTTGCACAA CATGTTCAAA CTGCTACTGA ACTTGATCAC GCGATGGAAA CATTGAAAAA 29520 

TAAAGTTGAT CAAGTGAATA CAGATAAGGC TCAACCAAAT TACACTGAAG CGTCAAGTGA 29580

TAAAAAAGAA GCAGTAGATC AAGCGTTACA AGCTGCAGAA AGCATTACAG ATCCAACTAA 29640 

TGGTTCAAAT GCGAATAAAG ACGCTGTAGA CCAAGTATTA ACTAAGCTTC AAGAAAAAGA 29700
AAATGAGTTA AATGGTAATG AGAGAGTCGC TGAAGCTAAA ACACAAGCGA AACAAACTAT 29760
TGACCAATTA ACACATTTAA ATGCTGATCA AATTGCAACT GCTAAACAAA ACATTGATCA 29820 
AGCGACGAAA CTTCAACCAA TTGCTGAATT AGTAGATCAA GCAACGCAAT TGAATCAATC 29880 
TATGGATCAA TTACAACAAG CAGTTAATGA ACATGCTAAC GTTGAGCAAA CTGTAGATTA 29940
CACACAAGCA GATTCAGATA AACAAAATGC TTATAAACAA GCTATTGCTG ATGCTGAAAA 30000 
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TGCAAAACAA GCATTAAATG GTGATGAACG TGTAGCACTT GCTAAAACAA ATGGTAAACA 30120 

TGACATCGAC CAATTGAATG CATTAAACAA TGCTCAACAA GATGGATTTA AAGGTCGCAT 30180 

CGATCAATCA AACGATTTAA ATCAAATCCA ACAAATTGTA GATGAGGCTA AGGCACTTAA 30240 

TCGTGCAATG GATCAATTGT CACAAGAAAT CACTGACAAT GAAGGACGCA CGAAAGGTAG 30300

CACGAACTAT GTCAATGCAG ATACACAAGT CAAACAAGTA TATGATGAAA CGGTTGATAA 30360

AGCGAAACAA GCACTTGATA AATCGACTGG TCAAAACTTA ACTGCAAAAC AAGTTATCAA 30420

ATTAAATGAT GCAGTCACTG CAGCTAAGAA AGCATTAAAT GGTGAAGAAA GACTTAATAA 30480

TCGTAAAGCT GAAGCATTAC AAAGATTGGA TCAATTAACA CATCTAAACA ATGCTCAAAG 30540

ACAATTAGCA ATCCAACAAA TTAATAATGC TGAAACGCTA AATAAAGCAT CTCGAGCAAT 30600

TAATAGAGCA ACTAAATTAG ATAATGCAAT GGGTTCAGTA CAACAATATA TTGACGAACA 30660

GCACCTTGGT GTTATCAGCA GCACAAATTA CATCAATGCA GATGACAATT TGAAAGCAAA 30720

TTATGATAAT GCAATTGCGA ATGCAGCACA TGAGTTAGAT AAAGTGGAAG GTAATGCAAT 30780

TGCaAAAGCT GAAGCAGAGC AATTGAAACA AAATATTATC GATGCTCAAA ATGCATTAAA 30840

TGGAGACCAA AACCTTGCAA ATGCCAAAGA TAAAGCAAAT GCGTTTGTTA ATTCGTTAAA 30900

TGGATTAAAT CAACAGCAAC AAGATCTTGC ACATAAAGCA ATTAACAATG CCGATACTGT 30960

ATCAGATGTA ACAGATATTG TTAATAATCA AATTGACTTA AATGATGCAA TGGAAACATT 31020

GAAACATTTA GTTGACAATG AAATTCCAAA TGCAGAGCAA ACTGTCAATT ACCAAAACGC 31080

TGACGATAAT GCTAAA 31096 
(2) INFORMATION FOR SEQ ID NO: 60: 
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(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2243 base pairs 
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

ATGACAGAAT GGGAGCGAGG ACTTAGAATG TTTCCTAAAT CAGGTTTATT AAATTTTGAG 60 

TTAGCGATAG mAAATCGTTC ATTAAATGAT GATGAAAAAG CATTAAAATA TGTGCGTAAA 120

GCATTAAATG CAGACCCTAA AAATACAGAT TATATTAACT TAGAAAAAGA GTTGACTAAA 180

TCAAATGAGT CGAAAAATAA ATAACTTTTA TGATGTACAA CAGTTATTGA AAAGTTACGG 240

ATTTCTAATA TATTTTAAAA ATCCAGAAGA TATGTACGAA ATGATTCAAC AGGAGATTTC 300
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TAATCAGAGA AGGAATGAAC AGAAATGACA AAAATTATTT TAG CAGCTGA TGTAGGCGGG 420 

ACGACTTGTA AATTAGGTAT TTTCACACCT GAATTAGAAC AATTACATAA ATGGTCTATT 480

CACACTGATA CATCTGATAG TACAGGATAT ACACTTTTGA AAGGAATTTA TGATTCGTTT 540

GTTGAAAAAG TAAATGAAAA TAATTATAAT TTTTCAAATG TACTTGGCGT AGGTATTGGT 600

GTACCAGGTC CTGTTGACTT TGAAAAAGGT ACAGTAAATG GAGCAGTAAA CTTATATTGG 660

CCAGAAAAAG TTAATGTACG TGAGATTTTT GAACAATTCG TTGATTGTCC AGTGTATGTA 720

GATAATGATG CTAACATAGC TGCTTTAGGG GaGAAACACA AAGGTGCTGG TGAAGGTGCC 780

GATGATGTTG TTGCCATCAC ACTTGGTACA GGTCTAGGTG GAGGAATTAT TTCCAAATGG 840

TGAAATCGTA CATGGTCATA ATGGCTCtGG CGCAGAAATA GGTCATTTTA GAgCAGACTT 900

CgATCAACGA TTTaAATGTA ATTGTGGTCG TTCTGGATGT ATTGAAACAG TTGCTTCaGC 960

GACAGGCGTT GTTAACTTAG TTAACTTCtA CTATCCGAAG TTGACGTTTA GATCTTCTAT 1020

ATTAGAATTG ATTAAAGAAA ATAAGGTtAC aGCAAAAGCT GTTTTTGATG CGGCAAAAGC 1080

TGGTGACCAA TTCTGTATTT TCATTACTGA AAAGGTTGCA AACTATATTG GATATTTATG 1140

TAGTATTATT AGTGTTACAA GTAATCCGAA ATATATCGTT CTAGGTGGAG GAATGTCTAC 1200

TGCAGGACCT ATTTTAATTG AAAATATTAA AACAGAATAT CATAATTTAA CATTTGCACC 1260

TGCTCAATTT GAAACTGAAA TTGTACAAGC GAAATTAGGT AATGATGCAG GTATTACAGG 1320

AGCAGCAGGA TTAATCAAGA CCTATGTATT AGATAAAGAG GGGGTAAAAT AATGGCTATT 1380

GTTGATGTGG TTGTTATTCC AGTTGGAACG GAAGGTCCGA GTGTTAGTAA ATATATTGCA 1440

GATATTCAGA AAAAACTTCA AGAATATAAA GCAATGGGTA AAATTGATTT TCAATTAACA 1500

CCAATGAATA CTCTAATTGA AGGTGAATTA AGCGATGTAT TAGAAGTTGT GCAAGTGATA 1560

CATGAATTAC CTTTTGATAA AGGTTTAAGT AGAGTTTGTA CAAATATCCG TATTGATGAC 1620

CGACGAGACA AATCTAGAAA AATGAATGAT AAACTAACAT CAGTACAAAA ACATTTAGAA 1680

AATAGTGGTG AAAACCTATG AGGATTTCAA GCTTAACTTT AGGCTTAGTT GATACTAATA 1740

CGTATTTCAT CGAAAATGAC AAAGCTGTTA TTCTGATTGA CCCTTCAGGT GAAAGTGAAA 1800

AAATTATTAA AAAATTAAAC CAAATAAATA AACCGTTAAA AGCTATTTTA TTAACACATG 1860

CACACTTTGA TCATATCGGA GCAGTCGATG ATATAGTTGA TCGATTCGAT GTCCCGGTTT 1920

ATATGCATGA AGCAGAGTTT GATTTTCTAA AAGATCCCGT TAAAAATGGG GCAGATAAAT 1980

TTAAGCAATA TGGATTACCA ATTATTACAA GTAAGGTAAC TCCTGAAAAG TTAAmCGAAG 2040

GTAGCACAGA AATAGAAGGA TTTAAGTTnT nAyrTGTaCA CACACCTGGA CATTCACCAG 2100
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GAATCGGACG TACAGATTTA TATAAAGGTG ATTATGAAAC GCTAGTTGAT TCTATTCAAG 
ATAAAATATT TGAATTAGAA GGC 
(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8009 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 






TTGGnATCAT 


tyAcgGTAAA 


AAGAATAAaG 


CAAGATTtAT


TTCATTAGTA 


CTAATTTGTG 


60 




CAATGTTTGC 


AATTTGTTGG 


GTTGCATATA 


TTCAATGGGA 


GTCTACAATC 


flfTTPA i i"i' , r& 


120




CACAATCTAT 


TAATATTTCa 


ATGGCACAAT 


ATAGTGTTTT 


ATGGACAATT 


AACGGAATAA 


180 




TGATTTTAGT 


AGCACAACCA


TTAATTAAAC 


CGATTCTCTA 


TCTGTTAAAA 


GGAAACTTAA 


240




AGAAGCAAAT 


GTTTGTCGGC 


ATCATCATTT 


TTATGTTGTC 


GTTCTTTGTC 


ACGAGTTTTG 


300 




CCGAAAACTT 


TACAATATTT 


GTTGTCGGTA 


TGATTATTTT 


AACTTTTGGA 


GAAATGTTTG 


360 




TATGGCCAGC 


AGTTCCAACT 


ATAGCCAATC 


AGTTAGCGCC 


AGATGGTAAG 


CAAGGACAGT 


420 




ACCAAGGTTT 


TGTGAATTCA 


GCTGCTACAG 


TAGGAAAAGC 


ATTTGGTCCA 


TTTCTTGGTG 


480 




GTGTATTAGT 


TGATGCGTTT 


AATATGCGCA 


TGATGTTTAT 


CGGTATGATG 


CTACTACTTG 


540 




TATTTGCATT 


AATATTATTA 


A 1 VjVj 1 1 i TLA 


A(jvjAvjAATAA 


TACGCAACCT


AAAAAAATAG 


600 




ATGCATAATG 


AGTAAATAGA 


ATTAACGTTA 


TAGACTTGAA 


ATAAATGTCG 


TTATAACATA 


660 




ATATTAATTT 


GTATAATTTA 


ATTTCGTTTG 


GAGCTTTTCT 


ACAGAAAGCT 


AGTGATGCTG 


720 




AGAGCTAGTG 


TTAAGGACTA 


AATGTAAATC 


GTATTAATTT 


TAAATTGAAT 


GAATGACATC 


780 




TCTTACTATT 


AAAATGAGTG 


CACAATTTTT 


GTGAAATAGG 


GTGGTAACGC 


GGCAAATGTC 


840 


GTCCCTATGT 


AAATAGAATA 


GTTAGAGGTG 


TCTTTTTTAT 


TGAATAGGAG 


GAAATGTGTT 


900 




GAATTACAAC 


CACAATCAAA 


TTGAAAAGAA 


ATGGcAAGAC 


TATTGGGACG 


AAAATAAAAC 


960 




ATTTAAAACA AATGATAACT 


TAGGTCAAAA 


GAAATTTTAT 


GCTTTAGACA 


WIUTCCATA 


1020 




TCCATCAGGT 


GCTGGTTTAC 


ATGTTGGACA 


TCCTGAGGGc 


TATACAGCAA 


CAGATATCAT 


1080




TTCAAGATAT 


AAAAGAATGC 


AAGGATATAA 


TGTATTACAT 


CCGATGGGGT 


GGGATGCATT 


1140 




CGGATTACCA 


GCAGAGCAAT 


ATGCTTTAGA 


CACTGGCAAC 


GACCCACGTG 


AATTTACAAA 


1200 




GAAAAATATC 


CAAACTTTTA 


AACGACAAAT 


TAAAGAATTA 


GGGTTCAGTT 


ATGATTGGGA 


1260 
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G TT AT AT AAC AAAGGTTTAG CATACGTTGA TGAAGTTGCA GTTAACTGGT GTCCAGCATT 138 0 

AGGCACTGTT TTATCTAACG AAGAAGTGAT TGATGGTGTC TCTGAACGTG GTGGACATCC 1440 

AGTTTATCGT AAGCCGATGA AACAATGGGT ACTTAAAATC ACAGAATATG CAGATCAATT 1500

ATTAGCAGAT TTAGATGATT TAGATTGGCC TGAGTCTTTA AAAGATATGC AGCGCAATTG 1560

GATTGGACGT TCTGAAGGGG CCAAAGTTTC ATTTGATGTA GATAATACGG AAGGAAAAGT 1620

AGAAGTATTT ACGACTAGAC CAGATACAAT CTATGGTGCA TCATTCTTAG TCTTAAGTCC 1680

TGAACATGCA TTAGTTAATT CAATTACAAC AGATGAATAT AAAGAAAAAG TAAAAGCTTA 1740

TCAAACAGAA GCTTCTAAAA AGTCAGATTT AGAACGTACA GATTTAGCAA AAGATAAATC 1800

AGGTGTATTT ACTGGTGCAT ATGCAACTAA TCCTTTATCT GGTGAAAAAG TACAAATTTG 1860

GATTGCTGAT TATGTATTAT CAACATATGG TACTGGAGCA ATTATGGCAG TACCAGCGCA 1920

TGATGACAGA GATTATGAAT TTGCTAAAAA GTTTGATTTG CCAATCATTG AAGTCATCGA 1980

AGGTGGAAAT GTTGAAGAAG CAGCATACAC TGGTGAAGGT AAACATATTA ATTCTGGTGA 2040

ACTTGATGGT TTAGAAAATG AAGCGGCAAT TACTAAAGCT ATTCAATTAT TAGAGCAAAA 2100

AGGTGCTGGC GAAAAGAAAG TTAATTACAA ATTAAGAGAT TGGTTATTCA GTCGTCAGCG 2160

TTATTGGGGC GAACCAATTC CTGTCATTCA TTGGGAAGAT GGAACAATGA CAACTGTTCC 2220

TGAAGAAGAG CTACCATTGT TGTTACCTGA AACAGATGAA ATCAAGCCAT CAGGGACTGG 2280

TGAGTCTCCA CTAGCTAATA TTGATTCATT TGTAAATGTT GTAGATGAAA AAACAGGTAT 2340

GAAAGGACGT CGTGAAACAA ATACAATGCC ACAATGGGCA GGTAGTTGTT GGTATTATTT 2400

ACGTTACATC GATCCTAAAA ATGAAAATAT GTTAGCAGAT CCTGAAAAAT TAAAACATTG 2460

GTTACCTGTT GATTTATATA TCGGTGGAGT AGAACATGCG GTTCTTCACT TATTATATGC 2520

AAGATTTTGG CATAAAGTCC TTTATGATTT GGCTATCGTA CCTACTAAAG AACCTTTCCA 2580

AAAATTATTT AACCAAGGTA TGATTTTAGG AGAAGGTAAT GAGAAGATGA GTAAATCTAA 2640

AGGAAATGTA ATCAATCCTG ATGATATAGT ACAGTCTCAT GGTGCAGATA CTTTGCGTCT 2700

TTACGAAATG TTTATGGGAC CTTTAGATGC TGCAATTGCA TGGAGTGAAA AAGGATTAGA 2760

TGGGTCTCGT CGATTCTTAG ATCGCGTATG GCGTTTAATG GTAAATGAAG ATGGGACATT 2820

GAGTTCAAAA ATTGTAACTA CAAATAATAA ATCTTTAGAT AAAGTTTATA ACCAAACTGT 2880

TAAAAAGGTA ACAGAAGACT TTGAAACATT AGGATTTAAT ACTGCTATTA GTCAATTAAT 2940

GGTATTTATT AATGAGTGTT ATAAAGTTGA TGAAGTTTAT AAACCTTACA TTGAAGGCTT 3000

CGTTAAAATG TTAGCACCTA TTGCACCACA TATCGGTGAA GAATTATGGT CAAAATTAGG 3060
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TGATGAAGTA 


GAAATCGTTG 


TTCAAGTGAA 


TGGTAAATTG 


AGAGCTAAAA 


TTAAAATTGC 


3180 






TAAAGATACA 


TCAAAAGAAG 


AAATGCAAGA 


AATTGCCTTA 


TCTAATGACA 


ATGTTAAAGC 


3240 






GAGTATTGAA 


GGTAAAGACA 


TCATGAAAGT 


CATCGCTGTT 


CCTCAAAAAT 


TAGTCAATAT 


3300 






TGTAGCTAAA


TAATGTTTTA 


AGGAGGACTT 


TGAAATGAAG 


TCAATTACTA 


CAGATGAATT 


3360 






AAAAAATAAA 


CTTTTAGAAT 


CTAAACCAGT 


TCAAATTGTT 


GATGTTCGTA 


CTGATGAAGA 


3420 






AACAGCAATG 


GGATATATTC 


CTAATGCAAA 


GTTAATTCCA 


ATGGATACCA 


TTCCGGATAA 


3480 






TTTAAATTCA 


TTTAATAAAA 


ATGAAATATA 


TTATATTGTA 


TGTGCTGGTG 


GAGTTCGAAG 


3540 






CGCTAAAGTT 


GTAGAATATT 


TAGAGGCAAA 


TGGCATTGAT 


GCCGTAAATG 


TCGAAGGCGG 


3600 




CATGCACGCA 


TGGGGCGATG 


AAGGTTTGGA 


AATAAAAAGT 


ATTTAAAGTA 


GTGACATAAT 


3660 






TTAAAATAAT 


ATTACATTTG 


TAATGACACC 


AAGTAACGTT 


TCGGTTGCTT 


ggtgtttttt 


3720 






GGTATGAATT 


ACTTTCTGTT 


ACAAAACAAT 


CTAAAGCGTT 


CTTGTTATGT 


TTTATTAAGA 


3780 






TTTTAATTAC 


AAAACGGAAA 


CTAAATTGTA 


ATAAAATAAA 


ACTTTATTTT 


ATAAAATGAT 


3840






GATGATAAAA 


TTGAGTGAAC 


TTAAAATATT 


GTACAAAATA 


ATATAGCTAT 


AAATATAATA 


3900 






TAGCTATAAA 


TATAATATGA 


GGGAGCGTAT 


A'rrrrrAGGA 


TAATTCTTAA 


CAACACAGCA 


3960






GAGAACAGAC 


AACCAGGAGG 


AAAATGAAAT 


GAATTTGTTA 


AAGAAAAATA 


AATATAGTAT 


4020






TAGGAAGTAT 


AAAGTAGGCA 


TATTCTCTAC 


TTTAATCGGA 


ACAGTTTTAT 


TACTTTCAAA 


4080




CCCAAATGGT 


GCACAAGCCT 


TAACTACGGA 


TAATAATGTA 


CAAAGCGATA 


CTAATCAAGC 


4140






AACACCTGTA 


AATTCACAAG 


ATAAAGATGT 


TGCTAATAAT 


AGAGGTTTAG 


CAAATAGTGC 


4200






GCAGAATACA 


CCTAATCAAT 


CTGCAACAAC 


CAATCAAGCA 


ACGAATCAAG 


CATTGGTTAA 


4260






TCATAATAAT 


GGTAGTATAG 


TAAATCAAGC 


TACGCCAACA 


TCAGTGCAAT 


CAAGTACGCC 


4320






TTCAGCACAA 


AACAATAATC 


ATACAGATGG 


CAATACAACA 


GCAACTGAGA 


CAGTGTCAAA 


4380 






CGCTAATAAT 


AATGATGTAG 


TGTCGAATAA 


TACCGCATTA 


AATGTACCAA 


CTAAAACAAA 


4440 




TGAAAATGGT 


TCAGGAGGAC 


ATCTAACTTT 


AAAGGAAATT 


CAAGAAGATG 


TTCGTCATTC 


4500 






TTCAAATAAA 


CCAGAGCTAG 


TTGCAATTGC 


TGAACCAGCA 


TCTAATAGAC 


CGAAAAAGAG 


4560 






AAGTAGACGT 


GCGGCACCGG 


CAGATCCTAA 


TGCAACTCCA 


GCAGATCCAG 


CGGCTGCAGC 


4620 






GGTAGGAAAC 


GGTGGTGCAC 


CAGTTGCAAT 


TACAGCGCCA 


TATACGCCAA 


CAACTGATCC 


4680 






TAATGCCAAT 


AATGCAGGAC 


AAAATGCACC 


TAACGAAGTG 


CTGTCATTTG 


ATGACAATGG 


4740 






TATTAGACCA 


AGTACCAACC 


GTTCTGTGCC 


AACAGTAAAC 


GTTGTTAATA 


ACTTGCCGGG 


4800 






CTTCACACTA 


ATCAATGGTG 


GCAAAGTAGG 


GGTGTTTAGT 


CATGCAATGG 


TAAGAACGAG 


4860 
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15 






TCGTATACAT GGAACTGATA 
AACAGTAAAT CCGAATTCTG 
TCAAGGCGCA ACAAATGTTA 
GACTGTTGAA GGCGGTCCAA 
CAAAATTCAA TTTGTACCTA 
AAAAGATGGT TACAAATACT 
TGTTTTTGTT GAAAGACGAA 
AACAACATCA TTAAAGAATA 
ATATCAAGTT CAATTACCTG 
TCCAAGTAAC AATTCAGGCG 
TCGTGTGATA ACAATTAAAA 
GCCTGATAAA ATACTCGATT 
AACAGTAACA TTTAACGAGA 
AGCTGCAGAA AGTCATACTG 
AGATGCATTA CAAGCCGAAG 
ATTAGATATC TTTAATGGTC 
CAATGTACCA TTAAATAAAA 
GCAACATACG TTAATTCGAA 
AATGGAAGAT TTAGTTAATC 
ACAAGTTATC GAGGAACATA 
TGATGGCGTT ACTAGAATCA 
ACCGGTTGTT AAACCAAATG 
AATTATCAAT GCAACACCAG 
AGCTACGGAT GAAACAGATG 
TGAAACAGCT AAAAATAATG 
TAAAAAAGCT GCAAGAGATG 
TAGTAATAGA GAAGCAACTC 
AACCAACCAT GCTTTAGAAC 
CAAAGGAGAT GGTCTAAATG 



CGAATGACCA TGGCGATTTT AATGGTATCG 
AATTAATCTT TGAATTTAAT ACAATGACTA 
TTATCAAAAA TGCTGATACT AATGATACGA 
CTTTGCGTTT ATTTAAAGTA CCTGATAATG 
AAAATGACGC AATAACAGAT GCGCGTGGCA 
ATAGCTTTGT TGACTCTATC GGACTTCATT 
CAATGGATCC AACAGCAACA AATAATAAAG 
ATGGTAATTC TGGTGCTTCT CTAGATACAA 
AAGGTGTTGA ATATGTGAAC AATTCATTGA 
TTGATGTTAA TGATATGAAT GTTACATATG 
GTACTGGAGG AGGTACAGCA AACTCTCCGG 
TAAGATATAA ATTACGTGTA AATAATGTGC 
CATTAACGTA TAAAACATAT ACACAAGATT 
TAAGTACAAA TCCATATACT ATCGATATCA 
TTGACAGACG TATTCAACAA GCTGATTATA 
TGAAACGACG CGCACAAACG ATTTTAGATG 
GAGTTTCTCA AGCATATATT GATTCATTAA
GTGTTGATGC TGAAAATGCA GTTAATAAAA 
AAAATGATGA ATTGACAGAT GAAGAAAAAC 
AAAATGAAAT AATTGGTAAT ATTGGTGACC 
AAGATCAAGG TATACAGACG TTAAGTGGGG 
CTAAAAAAGC AATACGTGAT AAAGCAACGA 
ATGCTACTGA AGACGAGATT CAAGATGCAC 
CTATTGATAA TGTTACGAAT GCTACTACAA 
GCATCAATAC TATTGGAGCA GTTGTTCCTC 
CAATTAACCA AGCAACAGCA ACGAAAAGAC 
AGGAAGAGAA AAATGCAGCA TTGAACGAAT 
AAATCAATCA AGCAACAACA AATGCTAATG 
CCATTAATCC AATTGCTCCT GTAACTGTTG 



AGAAAGCATT 
CTAAAAACGG 
TTGCTGAAAA 
TGAGAAATCT 
TTTATCAACT 
CTGGGTCACA 
AGTTTACTGT 
ATGACTTTGT 
CTAAAGATTT 
ATGCAGCAAA 
CACGACTTAT 
CGACACCAAG 
TCATTAATTC 
TCATGAATAA 
CATTTGCGTC 
AAAATCGTAA 
CTAATCAAAT 
AAGTTGACCA 
AAGCAGCAAT 
AAACGACTGA 
ATACTGCAAC 
AACAAAGGGA 
TAAATCAATT 
ATGCTGACGT 
AAGTAACTCA 
AACAAATAAA 
TAACTCAAGC 
TTGATAACGC 
TTAAGCAAGC 



4980 

5040 

5100 

5160 

5220 

5280 

5340 

5400 

5460 

5520 

5580

5640 

5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 
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TGATGCGACT CAAGAAGAAA GACAAGCAGC AATTGACAAA GTGAATGCTG CTGTAACTGC 6780 

AGCAAACACA AACATTTTAA ACGCTAATAC CAATGCTGAT GTTGAACAAG TAAAGACAAA 6840 

TGCGATTCAA GGAATACAAG CAATTACACC AGCTACAAAA GTAAAAACAG ATGCAAAAAA 6900 

TGCCATCGAT AAAAGTGCGG AAACGCAACA TAATACGATA TTTAATAATA ATGATGCGAC 6960 

GCTCGAAGAA CAACAAGCAG CACAACAATT ACTTGATCAA GCTGTAGCCA CAGCGAAGCA 7020

AAATATTAAT GGAGCAGATA CGAATCAAGA AGTTGCACAA GCAAAAGATC AGGGCACACA 7080 

AAATATAGTA GTGATTCAAC CGGCAACACA AGTTAAAACG GATACTCGCA ATGTTGTAAA 7140

TGATAAAGCG CGAGAGGCGA TAACAAATAT CAATGCTACA ACTGGCGCGA CTCGAGAAGA 7200 

GAAACAAGAA GCGATAAATC GTGTCAATAC ACTTAAAAAT AGAGCATTAA CTGATATTGG 7260 

TGTGACGTCT ACTACTGCGA TGGTCAATAG TATTAGAGAC GATGCAGTCA ATCAAATCGG 7320 

CGCAGTTCAA CCGCATGTAA CGAAGAAACA AACTGCTACA GGTGTATTAA ATGATTTAGC 7380

AACTGCTAAA AAGCAAGAAA TTAATCAAAA CACAAATGCA ACAACTGAAG AAAAGCAAGT 7440 

GGCTTTAAAT CAAGTGGATC AAGAGTTAGC AACGGCAATT AATmATATAA ATCAAGCTGA 7500 

TACAAATGCG GAAGTAGATC AAGCGCAACA ATTAGGTACA AAAGCAATTA ATGCGATTCA 7560

GCCAAATATT GTTAAAAAAC CTGCAGCATT AGCACAAATC AATCAGCATT ATAATGCTAA 7620 

ATTAGCTGAA ATCAATGCTA CACCAGATGC AACGAATGAT GAGAAAAATG CTGCGATCAA 7680

TACTTTAAAT CAAGACAGAC AACAAGCTAT TGAAAGTATT AAACAAGCTA ACACAAATGC 7740

AGAAGTAGAC CAAGCTGCGA CAGTAGCAGA GAATAATATC GATGCTGTTC AAGTTGATGT 7800 

AGTAAAAAAA CAAGCAGCGC GAGATAAAAT CACTGCTGAA GTGGcGAacG TATTGaAGCG 7860 

GTTAAACAAA CACCTAATGC AACTGACGAA GAAAAGCAGG CTGCTGTTAA TCAAATCCAA 7920 

TCAACTTTAA AGATTCAAGC AATTTAATCC AAATTTAATC CAAAACCCAA ACAAATGGAT 7980 

TCAGGGTAGG ACACCACTTA CAAATCCAA 8009
(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10953 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

ACCCACCCCn TGGGGATAnT TTACCTGGTG GGGCCTTCGA TTGCCTTTAG GTGAAACCaG 60 



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AGATGAATGC 


TAACCATATT 


CATTCTGCTA 


AAGATGGTCG 


TGTTACTGCG 


ACAGCTGAAA 


180 




TTATTCATCG 


AGGTAAGTCG 


ACACATGTAT 


GGGATATAAA 


AATTAAGAAT 


GACAAAGAAC 


240 




AATTAATTAC 


AGTTATGCGT 


GGTACAGTTG 


CTATTAAACC 


TTTAAAATAA 


AAGAACTGCT 


300 




AGCTGAAATG 


TTATGAGATA 


TTCATAACTA 


CGGCTAGCAG 


TTTTTTTATG 


CGCTATATTG 


360 




TTGTAGTTTT 


AGAAATGCTT 


GTTCAATGCG 


TTCGGCAGCT 


TTACGGCCAC 


CCATAACATT 


420 




TCTACCAAAT 


GGTCCTAATT


CTAAGTCTGC 


AAAGCATCCT 


GCGACAAATA 


GATTTGGTAT 


480 




CCATTCTAAT 


TTTTCGGAAA 


TAACAGGGTA 


ATTACATTCG 


TTGATAGGTG 


CATCATAATT 


540 




TTGTATTAAT 


TGCTTAATAA 


GTGGTTGTGA 


CATAAAATCT 


TGTTCAAAAC 


CAGTTGCAAC 


600 


CATAATCTGT 


TGATATGGAA 


CAGAATCATT 


TTCAGTGTTA 


ATTACACCAC 


CACTAATTTG 


660 




AGTGATAGGT 


GTTTTATGCa 


CATTTATACG 


ACCATTTTTA 


ATATGTTTTT 


TAAGGCGTAA 


720 




GTACAGTTCG 


TGAGGCATTG 


ATCCTTTATG 


ACGTTCGCGT 


TGTACAATGG 


CATTTCTTTC 


780 




AGGCATGCTT 


TTAGTACTTA 


AAAATGAAGA 


CATATTTTTC 


GGACCTAACC 


AACCAGGATC 


840 




AGCATCAAAG 


TCATGTATTT 


CAATATCTTT 


ATTTAGCCAT 


AAATGAATCT 


TTTTATCGTT 


900 




ATCATGATTT 


AACAATTTAA 


GTGCAAGATG 


TGCAGCAGTa 


ATGCCGCTAC 


CAACGATATG 


960 




ATCGGTCTTA 


TCATATACTA 


CTTGATCAAG 


TTCTTTCTCG 


AAGATATGAT 


TTACATTCTG 


1020 




TTTGTCTTTT 


AAAATGTCAG 


GCATAAACGG 


AATATTTGTA 


CTGCCTATTG 


CAATAACGAC 


1080 




GCAATCTGTA 


GTGATAATTT 


GTCCATCTTC 


TAACTTGATA 


TGCCATTTGT 


CTTCTTGTTT 


1140 




ATCTAAAGTT 


TGAACTAAAC 


CTTGAACCAA 


GCAATCCTCT 


AATTGATATT 


GTTTAGAAGC 


1200 




ATGTGCAATA 


TGATCCATAA 


ACATTGTCAA 


TTCAGGTCGT 


TGATAAGGAC 


CATAAAAAGC 


1260 




ATTTGTATAT 


TGGTGCTGTT 


TAGCGAATTG 


TTTTAGATGG 


AACGGTTGTG 


GATGTACGTG 


1320 




ATGTACAATC 


GGTGATCTTA 


AATAAGGCAT 


TTCTATTCGA 


TTTGTATATG 


AGTTAAACCT 


1380 




TTGGCAAAAA 


GTTTCGTGTG 


GGTCAATGAT 


TGTTAATCGG 


TCTGTTGTTA 


ATCCGCTTGA 


1440 


TAATAGTTTT 


TGTGCGATTG 


CAGTTCCCTG 


TATGCCACCG 


CCGATAATTG 


TCCAATGCAT 


1500 




AATAAAACCT 


CTCTCTTTTT 


AAAACGTAAT 


AGTTACGATT 


TATAATTATT 


ATTATCATAA 


1560 




TACATAACGA 


CATGAAAGGC 


AATTAAATTA 


AAGAGATATA 


TGTAGATAGG 


GCGAATCTGT 


1620 




AGTCAAAGAA 


AAAATCATTG 


AAAAAGAGGT 


AACAATGTCA 


AAAGAwAACA 


GCAGTAAAAT 


1680 




CATTCCTAAT 


TTGGAATCAT 


CTTACTGCTG 


TTTGTTGTTG 


ATTTATATTC 


ATGATTTTGT 


1740 




TATATAATCT 


ACAATTTTGT 


GTCTTTTAAG 


TCTTCCGAAA 


TTTCATCGAC 


TTTAGTCTTT 


1800 




TTAGTATAAG 


GCGTTTTAAT 


ATTATATGCT 


GCTTTCATAA 


TCATATGACT 


TGAAAGAGGA 


1860
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GCAATAAAAT ATAAAAACGT ACCAAATAGT AATGACATTG CACCTAATGT TGATGCTTTT 19 80 

CCGGCAGCAT GTGCACGTGA ATATACATCT TCAAGTCTCA ATAATCCTAT AGCTG CTAGG 204 0 

GCGCTAATTA AAGCACCGAT GATAACAAAG ATAAGTGCAA GACTAATCAG TATGATTTTG 2100 

ATCATGTTCA ATCACCTTAC CTTTGTCCAT AAATTTAGAG AATACTGCAG TACCTAAAAA 2160 

AGCTAATATA CCAATCATCA TAATAACGAC AATCATGTAT TTAATATTTA ATAAAATACT 222 0 

GAATAATGCT ATAACTGCCA TTAATTGAAG ACCAATCGCA TCTAATGCGA CAACACGATC 2 28 0 

GGCAAGTGAT GGG CCTAGCA CAACGCGAAT GAG CAT AG CT AA CAT AG AAA TGACAACTAT 234 0 

GATTAATGCA ATAACGATAA TAACATTATG ATTCATTATA TTTCGCCCAC CTCTCTTACA 2400 

ATTTTCTCTA ATGATGTTTT AATACTTTCT ACTTCTTGCT CTTTAGTTGA AAAATCTATG 24 60 

G CATGAAT AT AAATTTTTGT ACGATCGTCA CTTACACCAA GCACTACAGT ACCAGGTGTT 252 0 

AATGTAATTA AATTAGACAG CAAGACAATT TGCCAATCTT TTTTTAAATC TGTGTGATAA 25 8 0 

ACAAAGAATC CtGGTTCATT TTTAATCGAA GGTTTAATAA TAATTTTCAA AACATCAAAA 264 0 

TTAGCTTTAA TCAGTTCGAT TAAGAAAATA ATAACTAATT TAATAATACG ATATAGCGTG 2700 

ATGACATAAA ATCTACCTGG TAACACTCTG TGTAAGAGGT AAACAAGAAC TAGGCCAAAG 2760

ATGAAACCTA ACACAAAGTT ATTTGTTGTG TAACTATTTG TCACAAACAA CCAAAACACT 2820

GCGATAATAA AGTTTAATAC TAATTGTACA GCCATGTTAT TTACCTCCTA ATACAGCTTT 2880

AACGTAGGTT GATGGATTGT AGAATGTTTC TGCACCAGCT TTTACCATTG GATATAAGTA 2940

ATCTGCTGAC AATCCATATA AAACAGTTAT CACAACTGCA ACGATTGCAA TCGTAGTTAA 3000

ATATTTGACG TCGACTTTGT TATTAAGATC ATATCCTTTT GGTTGACCGA AAAAGCCTTG 3060

TAGGAATATG CGAATGACAG AATATAATAC GACTAAACTT GATAATAAGA CGATGACACC 3120

ACTTAAATAA AATCCTCTTT CAAATGTTGA TTGGACAATA AAAAATTTTC CATAAAAGCC 3180 

ACTGAGTGGG GGAATGCCAG CTAAACTTAA TGCTGCGATA AAGAATGACC AACCAAGTAC 3240

AGGATATCGT TTAATTAAGC CACCAAATTG TCTTAAATCA GCAGTGCCTG TAATTTTAAT 3300

CATAATTCCG ATAAGCAAGA ATAATGCAAG TTTTACTAAC ATGTCGTGCA ATGTATAGTA 3360 

AATAGCCCCA ATCATACCTG ACTCTGTCAT CATTGCAACG CCGACTAAGA TCACACCTAC 3420 

AGCAATCATG ACATTGTATA GGATGATTTT TTTAATGTTG GCATATGCAA CAGCACCGAC 3480

ACAACCAAAG ATGATCGTTA ATAGTGCTAA GAATAAAATG ACATAATGTG AAAAGCTTAC 3540

ATTATCACTA AAGAATAGGC TCAATGTTCT AGCGATTGCA TAAACACCAA CTTTTGTTAA 3600

CAAAGCACCA AAGAATGCAA TGATTGGAAT TGGTGGgCAT AGTATGCACT AGGTAACCAA 3660 
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ATATTGACTA AGCCACTGTC ATGCGCTGAA AGGTTAGCTA ATTTATTGCT TATATCTGCT 
AGATTCAATG TTCCTACTAC TGAATATAAA ATCGCTACAC CCATTACGAA GAAGGATGAC 
GATACAACGT TAACAAGAAC ATATTTTATT GTTTCTTGTA GTTGAATTTT TGTAGAACCA 
ATTACTAATA AGAAATAAGA TGACATTAAA AATACTTCGA AAAATACGAA TAGGTTGAAA 
ATGTCACCAG TTGTGAATGC ACCAATGATA CCTATTAACA TAAATAGTAC TGAAAAATAA 
TAATAATATC TTTCACGTTC AATACCAATT GTTTGGTATG AATATAAAAT CACAATAGCT 
GTAATAATAA TACTAGTAAT TATTAGTAGG GCACTGAATA TGTCTAATAC AAAGACAATA 

CTGTATGGTG CTTTCCATGA ACCTAGCTCT ACGCGTATTG GTCCATGTTT AACAACATTT 4200 

GCTAAATTGA TAATTGCCGC GACCAAGGTT AATAATGTAC CGCCTAGTGC GACATAACGC 4260 

TTTATAATAG GACGCTTTCC AATAAAGACA AGTAATATGG CTGTAATTAC TGGAATAACT 4320 

AGCGTTAACA CAAGCATATT ACTTTCAATC ATCTTCTGGA ACTCCTTTCA TACTCTCAAC 4380

GTTATCTGTG CCTAATTCTT TATATGTTCT AAATGCTAAT ACTAAGAAAA AGGCTGTTGT 4440

CGCAAgGCGA TAACGATTGC TGTTAAAATA AGTGCTTGCG GGaTAGGaTC AACATAGCTT 4500

TTTACGTTCG CTTCATAAAT TGGAACAGTA CCATGTTTAA GTCCGCCCAT AGTTATTAAA 4560

AATAAATTTG CTGCATGTGT TAATAGTGTA GTTCCCATAA CAATTCGTAT CAGACTTTTA 4620

GACAAAACGA GATAGACACT AATTGCTGTG AGAATACCAC TAACAAAAAT CATAATAATT 4680 

TCCACTATTC GTTCTCTCCA ATCGAAATAA TAATTGTCAT GACAGTACCA ACTACTGCAC 4740

ATAAAACACC GAAATCAAAG AATACTGCTG TTGTCATATG AACAGGTTCT AATATAAATA 4800 

ACGGTATATC AAATGTGACA TGCGTAAAGA AATTTTTGCC TAAAAACCAA CTTGCGATAG 4860 

GCGTCGCAAT ACAAAAAACT AATCCGATAC CTATCAAGAT TTTAAAATCT AATGGGAAAA 4920

TTTTACGCAT TGTTTCTATA TCAAATGCAA TCGTAATGAT AACAAGTGAA CTTGCGAATA 4980

ATAATCCGCC GACGAAACCG CCACCAGGTG TATAATGTCC TGCTAAGAAA AGTGAAAAAC 5040

CAAAGACCAT TACCATGAAA AAGATAATAA CTGCAGCAAA TTGCAAAATT AGATCATTTT 5100 

GTTGTCTATT CATGATTTTT CACCTCGTTA CCTTGCGTTT GACGCTTTTT ACGTAATTTA 5160 

ATCATTGTAT ATACAGCTAA TCCTGCGATA CCAAGCACAG ATGACTCGAA TAAAGTATCC 5220

ATACCACGGA AATCAACAAG TATGACGTTT ACCATGTTTT TACCGTGAGC tAAATCATAA 5280 

ACGTGCTCTT GATAAAACTT AGATATCGAT TCAAAATGTC TATTTCCGTA TGCAATTAAA 5340

CCGATAATAA TGACGGACAA ACCAACACCA CCAGCAATTA AAGCATTAGT AAGCTGGAAT 5400

GAGCGCTTTT CATTATAACG ATTTAAATTT GGTAAGTGGT AGAAGCATAA TAAGAACAAT 5460
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ATAAACAATA CAGACACAGC ATATCCAACT GCACTTAACA TAATGATGCT AAATAATCTT 5530 

GATTTAGCGA AAAGAATTAA AAAGGCAGCA CTTAATAATA AAATTACGAT ACAAACTTCG 5640

AAAATTCTAA TCGGACTAAC GTCTTTAAAA TTAATGTTGA AAGGTACTGA GAATATAGTG 5700

ACAAATGTTA ATAAAATTAA TGCACCAAAA ATGATAACTA AATTATTACG TGAATAATCG 5760 

GTAACATAGC TATTCGTCAT CTTTTCAGAG TAGTTTGGAA TAACATTTGC ACTTCTGTTG 5820 

TACCAATAAT TGAATGTTAG TTTACCAGGT TGTCGTTGCA ACAATTTCAC CCAATAACTA 5880

AATGTCACAA TTAGTAAGAT ACCTAAAATA TAAATCACTA ATGTTGATAA AAAGGCAGGC 5940

GTTAATCCAT GGAACATATG GAATTCAACA TCATCAATTA CCGTATGATT AATCGAAGag 6000 

TnAGCTGGTT CAATAATCGA ATTAGTTAAA ATGCCAGGGA ATAAACCAAA TACAATTACT 6060 

AATGTAGCTA AAATAGCTGG TGATAAAAGC ATTAATATTG ATACTTCGTG TGCTTTTTTA 6120 

GGTAATTGTT CAGGTTTATA TTGTCCGAAA AATATATGCA TTATAAATTT AATTGAATAT 6180

ACAAATGTGA AGAGACTGCC CACTATACCA ATGATTGGGA ATAGGTAGCC TAATGTATCA 6240

ACACTGAATA AATTTGCTTG GCTTGCTGTA AATGTTGTTT CTAAAAATGA TTCTTTTGAT 6300 

AAGAAACCAT TGAACGGTGG TACACCAGCg CATACTTAAT GCTGTAATA& CAGTGATTGT 6360

AAATGAAATA GGCATAATTG TTAGTAAGCC ACCTAATTTC TTAACATCAC GTGTACCAGT 6420

AGAATGATCC ACTGCACCTG TAATCATAAA TAGGGCACCT TTAAATGTTG CATGGTTGAT 6480 

TAAATGGAAT ATTGCAGCCG TAAATGCAGC AGCATATATT TTGCTATCAT CGCCTTGATA 6540

GTGATAACTA ATGGCACCGA TTCCAAGCAT CGCCATAATC ATACCTAATT GGGATACTGT 6600

TGAAAATGCC AGTATACCTT TCAAGTCTTG TTGTTTTGTT GCGTTTAGCG AAgCCCAGAA 6660 

TAATGTAATT AAACCAACGA GTGTGACAGT CCATACCCAA CCTTGCGATG CTGCGAAGAT 6720 

TGGTGTCATT CGAGCGATTA AATATAACCC TGCTTTAACC ATTGTTGCTG AATGAAGATA 6780 

AGCACTGACT GGTGTAGGTG CTTCCATTGC ATCTGGTAGC CAAATATAAA ATGGAAACTG 6840

AGCAGATTTT GTAAAAGCAC CAATCATGAT TAAAATCATC GCAAAAATGA AGAATGGGCT 6900 

ATTTTGAATT TCAGAAGCAT GTTGAATCAT GTACTGAATG CTAAATGATT GTGTTGGTAT 6960 

AGCGAGTAAG ATGATACCAC CTAATAATGA TAGACCACCA AATACTGTGA TTATGAGCGA 7020

TTTTTGAGCA CCATATATAG ATGCTTGTCG TTCGCGCCAG AATGAAATAA GTAAAAAACT 7080

AGAAAATGAC GTTAGCTCCC AGAATAAATA TAGAATAATA ACATTATCTG AAAGTACGAC 7140

ACCTAACATT GCACCCATAA ATAGTAATAA ATAACAATAA AAATTCCCTA GTTGTTCTGA 7200

CTTACTTAAG TAGCCGATTG AATATAATAC TACTAAACTG CCGATTCCTO AAATAAGCAA 7260
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CCAATTTAAG GTTTTCATTA CAGTATTACC TGACATCGTC GTTTTAATTA ATGTAAGCAT 73 80 

ATAAATAAAT ATGACGATAG GGACAGGTAA TACGAACCAT CCTAAATGTA TACGTTTAAA 7440 

AAATCTATAC AGGATAGGAA TAATGAGTGC GAATATTAAC GGTAATATCA CCGCAATATG 7500 

TAACAAACTC ACTATGTTGT CCTCCTTTAA AAAATATTTA TGTTATTCAT TATACATGAA 7560 

TGATATAGTT CTGAAAAACG TACACACTCC TTGTTGTGCT TTATTTTCAG AaGTATTTAA 7620 

ATAAGAAGAA ACACGTCATT TTTTATTTAA AATTTTCTTT GTATTGAAGT GAATAATCTT 7680 

CTTTTAAGCG TGCTAAACTA GCTAAAGACA TTTCAGCATG TTTTGTTTGC TGAGCTTTAA 7740

GTTTAGTTTC TAAATCTGTA ATTGCTTGTT GAAGTGAATC TTCATAGCGC AATACATCAA 7800

CATTGAAGTC GCGTAATTGT GAACGTTTCG TATAGCGTTT TTCAAAATGG CTTAATGCTT 7860 

TGCGGTCATG GAAAAATACA CCTTCAGTTT CAGTAGGGTT ATGTAAATCA CCTTGTTTCG 7920

GGTGTTTGAT AACTTGTTCA ACTTTAACAA GGACATCGTC TCCATTTTCT TCAACAATCG 7980

TGACACCATA GCTACCTGTT TTGTGTGAAA ATCGATATAG CTTCATGCTA TTTTCCTCCC 8040

TTAAAAGTAT GTTAATATAT ATGTATCATA ACATGAATGG AGAATATAAA TGGCTAACTA 8100 

TCCACAGTTA AACAAAGAAG TACAACAAGG TGAAATCAAA GTGGTTATGC ACACAAATAA 8160

AGGTGACATG ACATTCAAAT TATTTCCAAA TATTGCACCA AAAACAGTTG AAAATTTTGT 8220 

GACACATGCA AAAAATGGTT ATTATGATGG AATCACATTC CACCGTGTCA TTAATGACTT 8280 

CATGATTCAA GGTGGCGATC CAACAGCTAC TGGTATGGGT GGCGAAAGTA TTTATGGCGG 8340

TGCTTTTGAA GATGAATTTT CATTAAATGC ATTTAACTTA TATGGCGCAT TATCAATGGC 8400 

TAACTCAGGA CCTAATACTA ATGGTTCACA ATTTTTCATT GTTCAAATGA AAGAAGTACC 8460 

TCAAAATATG TTAAGTCAAC TTGCAGATGG TGGCTGGCCT CAACCAATCG TTGATGCATA 8520

TGGCGAAAAG GGTGGTACAC CATGGTTAGA TCAAAAACAT ACAGTATTCG GTCAAATCAT 8580 

TGATGGTGAA aCTACATTAG AAGATATTGC AAATACAAAA GTGGGACCAC AAGATAAACC 8640

ACTTCATGAT GTTGTAATTG AATCTATTGA TGTTGAAGAA TAATATCTAA ACATAATTAA 8700 

CTACCAACAT TTTAAACTCG GATAAAGCTA ATTTATGAAT GGATTAGTAT ATATTCCAAC 8760 

gAAAATAAAT AAACTAATAT GATGAGCAAT CTCAATATAT TTATCaAGAA AGCACAGTTT 8820 

TTAAATAGAT GTGTATTTTA AAGATAATAG TTGAGGTTGC TTTTTATGTT TTTACAGAGA 8880 

ATTGCTATTC AAATAGTAAA TAAATTGAAA ACAAAGTAGC TGGATATCAT ATTGATTTAG 8940

ATAGGAATTT GTTGCTAATT TTATTTGTAA ATCCAAGTTT GTAGAATTCT TATTCATTTA 9000

TAAAATAATA TTCGTATGAT TTGATTTTTT AATTAGTCCA CCATTTCGAT TTGTGCTATG 9060 
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AACATATCAA 


GGTGCGTGTA 


CTGGTATTCA 


AC CATACGGT 


GCGTTTGTTG 


AGACCCCTAA 


9180 




TCATACTGAA GGACTGATTC ATATATCAGA AATTATGGAT GACTACGTTC ATAATTTGAA 9240

GAAATTTCTA TCAGAAGGCC AAATTGTTAA AGCTAAAATT TTGTCTATAG ATGATGAAGG 9300

AAAGCTTAAT CTATCATTAA AGGATAATGA TTACTTCAAA AATTATGAGC GTAAGAAGGA 9360

AAAACAATCA GTATTAGATG AAATCAGAGA AACAGAAAAA TATGGGTTTC AAACACTTAA 9420

AGAACGCTTA CCAATCTGGA TAAAACAGTC AAAGCGAGCA ATTCGAAACG ACTAAAGGAA 9480

CAGATAAATC GTACCGAAAA TCATACAAAG GGTCTGAAAT GAAAGTTTCT TAGACTATAA 9540

AAGAGATTAG TATCTATTAA ATTTTATTAG ATACTAATCT CTTTTTGTCT ACGATAACGT 9600

AATATGaTTG ATTCTATTTA CACGTACAAA TGGTTTAAGG TGACATATCC ATTATCTTTG 9660

TTAGATAGAA TCGTTGATTT GCaATATTGT ATGTGGATTT GTTTTTTTTA TTTATTTTAG 9720

AAATGAGAAC TACAACTTAA AGTATTAAAC GAATTGCAAC TATATAAACA GATAATTGGA 9780

GAATGAAAAA ATTACATGTT ATAGTCAACT CAATAATTTT AAGGAGGAAT TAAGTAATGA 9840

AAAGTAAATA CGAACCATTG TTTGATAAAG TAGAATTACC AAATGGAGTA GAGTTGAGAA 9900

ATCGATTTGT GTTAGCCCCT TTAACACATA TTTCTTCAAA TGATGATGGT ACTATTTCAG 9960

ATGTAGAACT TCCTTATATT GAAAAGCGTT CACAAGATGT TGGTATTACA ATTAATGCTG 10020

CGAGTAATGT GAGTGATGTC GGAAAAGCAT TTCCAGGACA GCCATCAATC GCGCATGACA 10080

GTAATATTGA AGGACTAAAA CGATTAGCTA CAGCAATGAA GAAAAACGGT GCCAAAGCAC 10140

TCGTACAAAT ACATCATGGC GGTGCACAAG CATTGCCTGA ATTAACACCT GATGGAGACG 10200

TCGTAGCACC AAGTCCAATT TCTTTAAAAA GTTTTGGTCA GAAACAAGAA CATAGTGCTA 10260

GAGAAATGAC GAATGAAGAG ATTGAACAAG CAATCAAGGA TTTTGGTGAA GCAACGCGAC 10320

GTGCAATTGA AGCAGGGTTT GATGGTGTTG AAATACATGG CGCGAATCAT TACTTAATTC 10380

ATCAATTTGT ATCACCATAC TATAATAGAA GAAATGATGT ATGGGCAAAT CAATATAAAT 10440

TCCCGGTCGC TGTGATTGAA GAAGTACTTA AAGCGAAAGA AGCGTATGGC AATAAAGACT 10500

TTATAGTTGG ATACAGATTA TCTCCAGAGG AAGCGGAGTC TCCAGGAATC ACAATGGAAA 10560

TTACAGAGGA ACTCGTTAAT AAAATTAGCC ATATGCCAAT CGACTATATT CATGTTTCAA 10620




TCI AT(V5 ATA r* 


GCATGCAACG 


ACACGTGAAG 


GTAAATACGC 


TGGACAAGAA 


AGACTGCCTT 


10680 




TAATTCACAA ATGGATAAAT GGTCGTATGC CACTTATCGG TATTGGTTCA ATTTTCACAG 10740

CTGACGAAGC TTTAGATGCA GTTGAAAATG TTGGTGTTGA CTTAGTAGCC ATTGGTAGAG 10800

AGCTACTACT GGATTATCAA TTTGTTGAAA AAATTAAAGA TGGACGGGAA GATGAAATTA 10860
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AATTTAATGA AGGGTTTTAT CCATTACCAC GTA 
(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8155 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63:


TTTGATAnAA AACTGAATnA ATTAAATGTA TCGATTCAAC CTAATGAAGT GAATTTACAA 60

GTTAAAGTAG AGCCTTTTAG CAnAAAGGTT AAAGTAAATG TTAAACAGAA AGGTAGTTTA 120

GCAGATGATA AAGAGTTAAG TTCGATTGAT TTAGAAGATA AAGAAATTGA AATCTTCGGT 180

AGTCGAGATG ACTTACAAAA TATAAGCGAA GTTGATGCAG AAGTAGATTT AGATGGTATT 240

TCAGAATCAA CTGAAAAGAC TGTAAAAATC AATTTwCCAG AACATGTCAC TAAAGCACAA 300

CCAAGTGAAA CGmAGGCTTA TATAAATGTA AAATAAATAG CTAAATTAAA GGAGAGTAAA 360

CAATGGGAAA ATATTTTGGT ACAGACGGAg TAAGAGGTGT CGCAAACCAA GAACTAACAC 420

CTGAATTGGC ATTTAAATTA GGAAGATACG GTGGCTATGT TCTAGCaCAT AATAAAGGTG 480

AAAAACACCC ACGTGTACTT GTAGGTCGCG ATACTAGAGT TTCAGGTGAA ATGTTAGAAT 540

CAGCATTAAT AGCTGGTTTG ATTTCAATTG GTGCAGAAGT GATGCGATTA GGTATTATTT 600

CAACACCAGG TGTTGCATAT TTAACACGCG ATATGGGTGC AGAGTTAGGT GTAATGATTT 660

CAGCCTCTCA TAATCCAGTT GCAGATAATG GTATTAAATT CTTTGGATCA GATGGTTTTA 720

AACTATCAGA TGAACAAGAA AATGAAATTG AAGCATTATT GGATCAAGAA AACCCAGAAT 780

TACCAAGACC AGTTGGCAAT GATATTGTAC ATTATTCAGA TTACTTTGAA GGGGCACAAA 840

AATATTTGAG CTATTTAAAA TCAACAGTAG ATGTTAACTT TGAAGGTTTG AAAATTGCTT 900

TAGATGGTGC AAATGGTTCA ACATCATCAC TAGCGCCATT CTTATTTGGT GACTTAGAAG 960

CAGATACTGA AACAATTGGA TGTAGTCCTG ATGGATATAA TATCAATGAG AAATGTGGCT 1020

CTACACATCC TGAAAAATTA GCTGAAAAAG TAGTTGAAAC TGAAAGTGAT TTTGGGTTAG 1080

CATTTGACGG CGATGGAGAC AGAATCATAG CAGTAGATGA GAATGGTCAA ATCGTTGACG 1140

GTGACCAAAT TATGTTTATT ATTGGTCAAG AAATGCATAA AAATCAAGAA TTGAATAATG 1200

ACATGATTGT TTCTACTGTT ATGAGTAATT TAGGTTTTTA CAAAGCGCTT GAACAAGAAG 1260

GAATTAAATC TAATAAAACT AAAGTTGGCG ACAGATATGT AGTAGAAGAA ATGCGTCGCG 1320
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CTGGTGATGG TTTATTAACT GGTATTCAAT TAGCTTCTGT AATAAAAATG ACTGGTAAAT 
CACTAAGTGA ATTAGCTGGA CAAATGAAAA AATATCCACA ATCATTAATT AACGTACGCG 
TAACAGATAA ATATCGTGTT GAAGAAAATG TTCACGTTAA AGAAGTTATG ACTAAAGTAG 
AAGTAGAAAT GAATGGAGAA GGTCGAATTT TAGTAAGACC TTCTGGAACA aACCATTAGT 

TCGTGTCATG GTTGAAGCAG CAACTGATGA AGATGCTGAA aGATTTGCAC AACAAATAGC 1680 

TGATGTGGTT CAAGATAAAA TGGGATTAGA TAAATAAATA CTGTATTACA AATGAGCCGA 1740 

TGCGTATGcA nTcgtTTTTT GTGTTTGTAG AAATAATTTA TAGTACAAAC GTAAAATGAT 1800 

ATAAACAAAA TAAAAACAAA GTAATCAATA TCTAATATAA AATACACTGG TACTCAATAT 1860

ATAATGATGA TAAAATTAAT TTTAATTAGA TAGAGTTGCT TTGTGTTTTT AACGCAGATG 1920

CTACTACTTA TCTTAACAGT TGATTAAGTG AAATCATTTA ACAGCGAGAA TAATCAACCA 1980

GGAGGATGAC TTAATGAATT TATTCAGACA ACAAAAATTT AGTATCAGAA AATTTAATGT 2040

CGGTATTTTT TCAGCTTTAA TTGCCACTGT TACTTTTATA TCTACTAACC CGACAACAGC 2100 

GTCTGCAGCA GAGCAAAATC AGCCTGCACA AAATCAACCA GCACAACCAG CTGATGCCAA 2160 

TACACAGCCT AACGCAAATG CTGGTGCTCA AGCTAATCCT ACAGCACAGC CAGCTGCACC 2220 

TGCCAACCAA GGACAACCAG CAGTACAACC AGCAAACCAA GGTGGACAGG CTAATCCAGC 2280 

AGGAGGAGCA GCACAACCAA ATACACAACC AGCTGGACAA GGTGATCAAG CTGATCCGAA 2340 

TAACGCTGCA CAAGCACAAC CTGGAAATCA AGCAACACCG GCAAACCAAG CAGGTCAAGG 2400 

AAATAACCAA GCAACACCTA ATAATAATGC AACACCGGCA AATCAAACAC AGCCAGCGAA 2460 

TGCTCCAGCA GCAGCGCAAC CAGCAGCACC TGTAGCAGCA AACGCACAAA CTCAAGATCC 2520

AAATCCTAGC AATACTGGTG AAGGCAGTAT TAATACGACA TTAACATTTG ATGATCCTGC 2580

CATATCAACA GATGAGAATA GACAGGATCC AACTCTAACT GTTACAGATA AAGTAAATGG 2640 

TTATTCATTA ATTAACAACG GTAAGATTGG TTTCGTTAAC TCAGAATTAA GACGAAGCGA 2700 

TATGTTTGAT AAGAATAACC CTCAAAACTA TCAAGCTAAA GGAAACGTGG CTGCATTAGG 2760 

TCGTGTGAAT GCAAATGATT CTACAGATCA TGGTAACTTT AACGGTATTT CAAAAACTGT 2820 

AAATGTAAAA CCAGATTCAG AATTAATTAT TAACTTTACT ACTATGCAAA CGAATAGTAA 2880 

GCAAGGTGCA ACAAATTTAG TTATTAAAGA TGCTAAGAAA AATACTGAAT TAGCAACTGT 2940 

AAATGTTGCT AAGACTGGTA CTGCACATTT ATTTAAAGTA CCAACTGATG CTGATOGTTT 3000

AGATTTACAA TTTATTCCTG ACAATACAGC AGTTGCTGAT GCTTCAAGAA TTACAACAAA 3060

TAAAGATGGT TATAAATACT ATTCATTCAT TGATAATGTA GGTCTATTCT CAGGATCACA 3120 



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TAATACTGAA 


> ATCGGTAACA 


a'1 '/"I/ '7A n 1 1'**!**! 1 

nluvlivil X X 


1 WjTGuiCA 


TTAAAAGCAG 


ATCAATTTAA 


3240 




ATATGAAGTA ACATTACCAC 


AAGGTGTAAC 
**f»ww x w x aa\_> 


X InLvjl 1 An 1 


T\ TV irwpp TV • |Mi* *v TV 

AA1 X Lift. X IAA 


CTACAACATT 


3300 


5 


CCCTAATGGT 


AATGAAGACA 


GTACAGTATT 


f3 21 21 21 21 IV X 11 TV* 


ACTGTTAATT 


ATGATCAAAA 


3360 




TGCAAATAAA 


GTTACATTTA 


CAAG rrAAnn 


X X bALAALG 


G CACGTGGTA 


CACACACTAA 


3420 




AGAAGTTTTA 


TTCCCAGATA 


AATCTTTAAA 


21 r P r P 21 *TY* A *T* a T* 
aX 1MJ LniAi 


TV TV TV /"'^*T' TV TV 

AAAGX 1 AATG 


TTGCGAATAT 


3480 


10 


CGATACACCT 


AAAAATATTG 


A X X X X AA X V7 A 


212121 21 TT a i^a 
KHnAi 1AALA 


I ATCGTACTG 


CTTCAGATGT 


3540 




TGTAATTAAT 


AATGCGCAAC 


CAGAARTa PA 


X AAL 1 bUAG 


ATCCATTTTC 


AGTAGCGGTT 


3600 


15 


GAAATGAACA 


AAGATGCGTT 


GCAAC21AC!AA 


uiaMv. X UAL 


TV TA IWIU V TV FY! IV m 

AAGTTGATAA 


TAGTCATTAC 


3660 


ACAACAGCAT 


CAATTGCAGA 


ATAPAATAAA 
AX ALnn 1 aaa 


l-X X AAAt_AAC 


TV TV ^»^*1V «v mm 

AAGCAGATAC 


TATTTTAAAT 


3720 




GAAGATGCGA 


ATCATGTTAA 


A & rrrir* a a a*r* 

aaV- x VjLAAAi 


CGTGCATCTC 


AAGCGGATAT 


TGATGGTTTA 


3780 


20 


GTAACTAAAT 


TACAAGCTGC 


ax 1 nn J. luii X 


AAT CAAG CAG 


CAATTGCTGA 


ATTAGATACT 


3840 




AAAGCTCAAG 


AAAAGGTTAC 


a r;r* a a /^a a 


CAAAGTAAAA 


AAGTTACGCA 


AGATGAAGTT 


3900 




GCAGCACTTG 


TAACTAAAAT 


to, tra aTfiT 

x nnUnn Ignl 


AAAAATAATG 


CAATCGCAGA 


AATTAATAAA 


396a 


25 


CAAACTACAG 


CACAAGGTGT 




AAAGATaATG 


GTATCGCAGT 


GTTAGAACAA 


4020 




GATGTGATTA 


CACCAACAGT 




^ /"'I* -1 TV 7V IV /"» TV TV 

GCGAAACAAG 


ATATTATCCA 


AGCAGTTACA 


4080 




ACTCGTAAAC 


AACAAATTAA 


Annu X Winn X 


Gv_LA X LAxTAC 


TV TV /"^ TV ^n^^ • « • a 

AAGATGAAAA 


AGATGTAGCA 


4140 


30 


AATGATAAAA 


TTGGTAAAAT 


TGAAACAAAG 


GCAATTAAAG 


TV TV ' I M 1 TV 

ATATTGATGC 


AGCAACAACA 


4200 




AATGCACAAG 


TAGAAGCCAT 


TAAAACAAAA 


GCAATCAATG 


JV A TV A *T»/""» TV 

ATATTAATCA 


AACTACACCT 


4260 




GCTACAACAG CTAAAGCAGC AGCTCTTGAA GAATTTGACG AAGTTGTTCA AGCACAAATT 4320

GATCAAGCAC CTTTAAATCC TGATACAACA AATGAAGAAG TAGCGGAAgC TATTGAACGT 4380

ATTAATGCAG CTAAAGTTTC TGGTGTTAAA GCAATTGAAG CGACAACGAC TGCACAAGAT 4440

TTAGAAAGAG TTAAAAACGA AGAAATCTCA AAAATTGAAA ATATTACTGA CTCTACGCAA 4500

ACAAAAATGG ATGCCTATAA TGAAGTTAAA CAAGCTGCAA CAGCTAGAAA AGCTCAAAAT 4560

GCTACAGTTT CAAATGCAAC AAATGAAGAA GTAGCAGAAG CTGATGCAGC AGTAGATGCA 4620

GCTCAAAAGC AAGGTTTACA TGACATCCAA GTTGTTAAAT CAAAACAGGA AGTTGCTGAT 4680

ACAAAATCAA AAGTATTAGA TAAAATCAAT GCAATTCAAA CACAAGCAAA AGTTAAACCT 4740

GCAGCTGATA CGGAAGTAGA AAACGCATAT AATACACGTA AACAAGAAAT TCAAAATAGC 4800

AATGCTTCAA CTACAGAAGA AAAACAAGCT GCATATACAG AATTAGATAC TAAAAAGCAA 4860

GAAGCAAGAA CAAATCTTGA TGCTGCAAAT ACAAACAGTG ATGTAACAAC AGCTAAAGAC 4920
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GCGGAAATCG 


CTCAAAAAGC 


AAGTGAACGT 


AAAACAGCAA 


TTGAAGCAAT 


GAATGATTCG 


5040 




ACTACTGAAG AACAACAAGC AGCGAAAGAC AAAGTGGATC AAGCAGTAGT TACTGCAAAC 5100

GCTGATATAG ATAATGCTGC AGCAAACAAT GATGTGGATA ATGCAAAAAC TACAAATGAA 5160

GCTACAATCG CAGCCATTAC ACCTGATGCA AATGTTAAAC CAGCAGCAAA ACAAGCAATT 5220

GCAGATAAAG TACAAGCTCA AGAAACAGCA ATTGATGGAA ATAACGGCTC AACAACTGAA 5280

GAAAAAGCAG CTGCTAAACA ACAAGTTCAA ACTGAAAAAA CAACAGCTGA TGCCGCAATA 5340

GATGCAGCAC ATACAAATGC GGAAGTTGAA GCGGCTAAAA AAGCAGCAAT TGCTAAAATT 5400

GAAGCGATTC AGCCAGCAAC AACAACTAAA GATAATGCGA AAGAAGCAAT TGCTACGAAA 5460

GCGAATGAAC GTAAAACAGC AATCGCTCAA ACGCAAGACA TTACTGCTGA AGAAATTGCA 5520

GCGGCTAATG CGGACGTAGA TAATGCTGTG ACACAAGCAA ATAGCAACAT TGAAGCTGCT 5580

AATAGTCAAA ATGATGTAGA CCAAGCGAAA ACGACAGGTG AAAATAGTAT TGATCAAGTA 5640

ACACCAACAG TTAATAAAAA AGCAACTGCA CGTAATGAAA TCACAGCAAT TTTAAATAAC 5700

AAATTGCAAG AGATTCAAGc tACGCCAGAT GCAACAGATG AAGAAAAACA AGCAGCTGAT 5760

GCTGAAGCAA ATACTGAAAA TGGTAAAGCA AATCAAGCCA TTTCAGCAGC AACTACTAAC 5820

GCACAAGTTG ATGAAGCTAA AGCAAATGCA GAAGCAGCGA TTAATGCGGT AACACCAAAA 5880

GTTGTGAAGA AACAAGCGGC TAAAGATGAA ATTGATCAAT TACAAGCAAC GCAAACAAAT 5940

GTTATCAATA ATGATCAGAA CGCTACAACA GAAGAAAAAG AAGCAGCTAT TCAACAATTA 6000

GCAACAGCAG TTACAGACGC GAAAAATAAT ATTACAGCTG CAACTGATGA TAATGGTGTA 6060

GATCAGGCGA AAGACGCTGG AAAGAATTCA ATTCAAAGCA CGCAACCAGC AACAGCGGTT 6120

AAATCAAATG CTAAAAATGA TGTTGATCAA GCTGTGACAA CTCAAAATCA AGCAATTGAT 6180

AATAGAACTG GTGCTACAAC TGAAGAGAAA AATGCAGCAA AAGATTTAGT TTTAAAAGCT 6240

AAAGAAAAAG CGTATCAAGA TATCTTAAAT GCACAAACAA CTAATGATGT TACGCAAATT 6300

AAAGATCAAG CAGTTGCTGA TATTCAAGGT ATTACTGCAG ATACAACAAT TAAAGATGTT 6360

GCGAAAGATG AATTAGCAAC AAAAGCAAAC GAACAAAAAG CGCTTATTGC ACAAACTGCA 6420

GATGCGACTA CTGAAGAAAA AGAACAAGCA AATCAACAAG TAGACGCACA ATTAACACAA 6480

GGTAATCAAA ATATTGAAAA TGCACAGTCA ATCGATGATG TAAACACTGC AAAAGATAAT 6540

GCAATTCAAG CAATTGACCC AATTCAAGCA TCAACAGATG TTAAAACGAA TGCAAGAGCG 6600

GAATTGCTAA CTGAAATGCA AAATAAAATA ACTGAAATAC TTAATAATAA TGAGACTACT 6660

AATGAAGAAA AAGGTAACGA TATTGGACCA GTTAGAGCAG CATATGAAGA AGGTTTAAAT 6720
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is 






AAAGTTCAAC AACTTCATGC AAATCCTGTT AAGAAACCAG CAGGTAAAAA AGAATTAGAT 6840

CAAGCTGCAG CTGATAAGAA AACACAAATA GAACAAACAC CAAATGCATC ACAACAAGAA 6900

ATTAATGATG CAAAACAAGA AGTTGATACT GAATTAAATC AAGCGAAAAC AAATGTCGAT 6960

CAATCATCAA CAAATGAATA TGTTGATAAT GCAGTTAAAG AAGGAAAAGC TAAAATTAAT 7020

GCAGTTAAAA CATTTAGTGA GTACAAAAAA GATGCTTTAG CTAAAATTGA AGATGCATAT 7080

AATGCTAAAG TAAACGAAGC GGATAACTCT AACGCATCGA CTTCAAGTGA AATTGCTGAA 7140

GCGAAACAAA AACTTGCTGA ATTAAAACAA ACTGCGGATC AAAATGTTAA TCAAGCTACT 7200

TCTAAAGATG ACATTGAAGT TCAAATTCAT AATGACTTAG ATAATATTAA CGATTACACA 7260

ATTCCAACAG GTAAAAAAGA ATCAGCTACA ACAGATTTAT ATGCTTATGC AGATCAGAAG 7320

AAAAATAATA TTTCAGCTGA CACTAATGCA ACACAAGATG AAAAGCAACA AGCAATTAAG 7380

CAAGTTGACC AAAATGTTCA AACTGCATTA GAAAGCATTA ATAATGGTGT GGATAATGGT 7440

GACGTTGATG ATGCATTAAC ACAAGGTAAA GCAGCAATTG ATGCTATTCA AGTAGATGCT 7500

ACTGTTAAAC CTAAAGCGAA CCAAGCTATT GAAGTTAAAG CAGAAGATAC GAAAGAATCT 7560

ATTGATCAAA GTGACCAGTT AACTGCTGAA GAAAAAACTG AAGCATTAGC AATGATTAAA 7620

CAAATTACAG ATCAAGCTAA ACAAGGTATT ACTGATGCAA CAACAACTGC TGAAGTTGAA 7680

AAAGCGAAAg CTCaAGGACT TGAAGCATTT GATAACATTC AAATCGACTC AACAGAAAAA 7740

CAAAAAGCTA TCGAAGAATT AGAAACTGCA CTAGACCAGA TTGAAGCAGG TGTAAATGTC 7800

AACGCTGATG CTACAACTGA AGAAAAAGAA GCGTTTACGA ATGCTTTAGA AGACATTTTA 7860

TCAAAAGCAA CTGaAGATAT TTCTGATCAA ACTACAAATG CAGAAATCGC TACTGTCAAA 7920

AATAGTGCGC TTGAACAACT TAAAGCACAA CGTATTAATC CTGAAGTTAA GAAAAATGCT 7980

TTGGAAGCAA TCAGAGAAGT GGTTAACAAG CAAATAGGAA tAATTAAAAA TGCAGATGCA 8040

GATGCATCGG CGGAAAGAnA TTGCACGTAC GGGATTTAGG TAGATATTTT GGACCGATTT 8100

GCTGGATAAA TTTAGGGTnA AACCCCAACC AATGCCGAAG TTGCCTGAAT TACCA 8155



(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1630 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:
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CTGTTTTATT TGCAGCACCC ATACTGGAAA TCACTTTAAT CCCTCGGTCA AGACACTCTT ~ 120 

TCATTAAGTG TACTTTGTAC ATTATTGTAT CACTTGCATC TACAAAATAA TCTATATCGT 180 

AGTTATCGAA AATTTCTTCA TATGTCTCTT CTGTATAAAA CATATGTAAG GGCGTGACTT 240

TACAATCTGG ATTAATTAAT TTAATACGTT CTTCCATCAA AGAAACTTTA CTTTGTCCTA 300

CCGTTGTAGT TAAAGCGTGT AATTGTCTGT TTACATTTGT AATATCAACA TCATCTTTAT 360

CTATTAATAT AATATGACCA ATATTCGTTC TTGCTAATGC TTCAGCAGCA AATGAACCAA 420

CACCTCCAAC GCCAAGTATG ACAACAGTTT GTTGCTTCAA TAAATCTAAA CCTTGTTGTC 480

CAATCGCTAG TTCATTTCTT GAAAATTGAT GTTTCATTAT TTTACCTCTT TCACTGATTT 540

ATACATAAGT ACATAGTAAC TTAAAATTTT ATATTTAGCA TTATCACTTT GATTATTTTC 600 

CCAAAATTCA ACGAGGAAAC ATTTATTAAA CGCTATAAAA CCCAACTAAT TCTTTATTAA 660

AAACTTAAAG AAACGCATAA AAATACGCAA GACAAAGTCT TGCGTATCGA TAGAGTCCGT 720 

ATTGCCGTAG TTATAATAGC TTGATCATTC GGCCTGTTAT ATACAGGTGG GTGCCCTGTT 780

TCTTGTTTTG TACGTCCTTC ATATAAGGCG TGTACGCTGC AAGAAAACCC ATTGGGCTCC 840

CTTGATCAAA GAGTGTTAGG CCCAAATTAA AAAGCAAACT TACGAACAAC TCAGATGACT 900

ATCTTATGAT GTTATATTAC CACATAATTA AAATTAATGA AATTATAACA AACCAAAGTT 960 

TATTGATTTT TTAAAATTTA GTGACGAATT CGCAAAGAAA GTTCTTCTAA TTGTTTATCA 1020 

GAAACTTCAC TAGGCGCATT CGTTAATAAA CATGTAGCAG ATGCTGTTTT AGGGAATGCG 1080

ATTGTATCTC TCAAGTTTGT TCTATTAGTC AATAACATGA CTAATCGGTC tAATCCTAAT 1140

GCAATACCGC CATGTGGTGG TGCACCATAT TTAAATGCAT CTAGTaAGAA GCCGAACTGT 1200

TCCTgTGCTT GTTCTTTAGT AAATCCAAGA ACTTCGAACA TTTTTTCTTG TAACTCACCA 1260 

TCATGAATTC TGATTGAACC GCCACCTAAT TCATAACCAT TTAATACTAT GTCATAAGCA 1320

TTTGCCTCAG CTTCtTCTGG CGCAGTGCCA AGCTTAGCAA TATCAGCTTC TTTTGGAGAT 1380

GTAAATGGAT GATGTGCTGC AACGTAACGT TTCGCATCTT CATCATATTC TAATAATGGC 1440

CAATCTGTCA CCCATAAGAA GTTTAATTTT GTTTCATCGA TTAAACCTAA TTCTTTAGCT 1500 

AATTTGACAC GTAATGCACC TAAACTTTGT GCAACGACAT TTGGTttGTC TGCAACAAAC 1560

ATTACTAAGT CACCAGCTTC AGCACCAGTT AATGTAAGTA ATGTTTCAAC ATTTTCTGTT 1620

cAAAGAAACG 1630 

(2) INFORMATION FOR SEQ ID NO: 65:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 732 base pairs 
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(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65:

CAATTGGACA TCTTGTATGA AAAGGACAAC CTTGCGGCGG ATTACTTGGC GAAGGTAATT 60 

CTCCTTTTAA TATAATTCTA TTGTTATTAT GTTTATCAAT TTGTGGTATT GATGAAATCA 120

ACGCTTTTGT ATATGGATGT TTGGGATTTT CATAAATTTC TTTATCAGAT GCGATTTCAA 180

CTATATGACC TAAATACATA ACTCCAATGA CATCACTTAT ATGTTTTACT ACACTTAAAT 240 

CATGTGCGAT AAATAAATAG CTTAAGTTAA ATTGTTCTTG TAAATCTTTT AATAAATTCA 300

GTACTTGAGA TTGAACAGAT ACATCTAATG CACTTACAGG CTCATCAGCA ACAATTAAAC 360 

TCGGACGCAA AGCCAATGCT CTTGCAATTC CCACTCTTTG TCTCTGTCCA CCTGAAAATT 420 

CATGTGCATA TTtATAATAT GCATCTTCAC TTAGGCCAAC ACATTTTAAT AAATATAGTA 480 

CTTCTTTTTT TATTTCTTCT TTTGGCAATT TTTTATAATT TAAAATAGGT TCTGAAATGA 540 

TATCTCCAAC CATTTGCATC GGATTCAATG ATGCATACGG ATCTTGAAAT ATCATCTGAT 600

ATTGTTGTCG TGATTTTCTG AGTTTTTTAC CTTGTAATCT TGTTATATCT TCACCATTAA 660 

CAATTATTGA GCCTGAAGTT GCATCTTCAA GCCTGATAAT CACTTTACCT AACGTTGACT 720

TACCACAACC CG 732 
(2) INFORMATION FOR SEQ ID NO: 66: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5838 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

AATATATTCA TATGTTTCAT CAACAATATT AGCTGCTTTT TGAATTAAAG CAATTTCGTC 60

AGCATCTTTG ACGTCTCTAA TTTTATCTAC AGTATTAGAA ATGCTTATTA ATGATATACG 120 

GCTTTTATTT AATTCAAGGT ATGTATCATA ACTTACATGA TGCCCCTCAA AACCTACATT 180

TTCAAAATTT TCTTGGTGTA GCAATTCTTT AATCTCACCA ATAATAGTAG ATTTACGATT 240

AATAATTTCA TAATTTGGCG CCTGCTTAGT TGCTTGATCA ATATATCTAA AGTCTGTTAT 300 

CAAATATTGT TTATCTTTAG ATATGATAAG TGCTCCACTG GTACCAGTAA AACCTGATAA 360 

ATATCTTCTA TTGTAATCCG AAAGAATGaT AATCGCATCT AAATGTTTTT GTTCTAAAAT 420
CAACTTTATA CATTAAAATA ATATCATAAT AAGGATAAAA AATAATAGAT ATTGATTTTA 540

GGGAGATAGT AATGAAAAAA TTGGTTTCAA TTGTTGGCGC AACATTATTG TTAGCTGGAT 600

GTGGATCACA AAATTTAGCA CCATTAGAAG AnAAAACAAC AGATTTAAGA GAAGATAATC 660

ATCAACTCAA ACTAGATATT CAAGAACTTA ATCAACAAAT TAGTGATTCT AAATCTAAAA 720

TTAAAGGGCT TGAAAAGGAT AAAGAAAACA GTAAAAAAAC TGCATCTAAT AATACGAAAA 780

TTAAATTGAT GAATGTTACA TCAACATACT ACGACAAAGT TGCTAAAGCT TTGAAATCCT 840

ATAACGATAT TGAGAAAGAT GTAAGTAAAA ACAAAGGCGA TAAGAATGTT CAATCGAAAT 900

TAAATCAAAT TTCTAATGAT ATTCAAAGTG CTCACACTTC ATACAAAGAT GCTATCGATG 960

GTTTATCACT TAGTGATGAT GATAAAAAAA CGTCTAAAAA TATCGATAAA TTAAACTCTG 1020

ATTTGAATCA TGCATTTGAT GATATTAAAA ATGGCTATCA AAATAAAGAT AAAAAACAAC 1080

TTACAAAAGG ACAACAAGCG TTGTCAAAAT TAAACTTAAA TGCAAAATCA TGATAGGAGT 1140

CTTTTAATGC GTAATATAAT ATTTTATCTT GTACTTATTA TTGCTGCGAT TGGATTAGTA 1200

ATGAATCTAG ATGCCTTTAT TTTTTCAATC GTCAGAATGT TAATCAGCTT TGcgTAaTAG 1260

CTGGTATTAT TTATCTGATT TATTATTTCT TCATCTTAAC TGAAGACCjLA CGCAAATATC 1320

GCAAAGCAAT GCgTrAaGTA TAAAAGAAAT CAAAGAAGAA AATAGATAAA AAAACGGAAG 1380

CACTTGTAGG TAAAATAGTC TACGTGCTTC CATTTTTTAT TCTAAAAACT ACTTTCTAAA 1440

CATCCATTCA TCTGAACGAT ATTTTTCAGT TAATTCTTCC ACTTCTGCCA ATTGAGCTTC 1500

TGtTAATTCA AGTGGCTTTA ATTCTATATT TAAACCTTTC TTAAAACCTT TCTCGAAAGC 1560

TTCTTCCATT TGACTAATAG TAATGTGTTC ATCTGAAATA TCATTGATGG CAACTGCTTT 1620

TTCAACGAAT GCCTCTTTCA TTTTTAATTT TAATCTTTCA TTTTTATAAA TrAACATATC 1680

AAACAGTTCA TCAATATCAA TATCTTGTAA AATCGAACCG TGTTGGAGGA TTACGCCCTT 1740

TTGTCTCGTT TGAGCACTCC CAGCAATCTT ACGGCCTTCA ACAACTAGCT CATACCAACT 1800

TGGTGCATCA AAACACACTG AACTTCGAGG TTGTTTTAAT TTTTGACGCT CTTCAGGCGT 1860

TTTAGGTACC GCAAAATAAG TATCAAATCC TAAGTTTTTA AATCCTTCTA ATAATCCTTG 1920

TGAAATCACT CTGTACGCTT CTGTAACTGT AGAAGGCATA TTCGGATGCG ATTCAGGCAC 1980

AATCACACTG TAAGTTAACT CTTTATCATG TAGCACCCCA CGGCCACCAG TTTGACGCCT 2040

TACGAGACCA AAACCTTTCT CTTTAACCTT ATCAATATCA ATTTCTTTTT GTAGCCTTTG 2100

GAAATACCCT ATTGATAATG TTGCAGGATT CCATGTGTAA AAACGTATAA CTGGATCAAT 2160

TTCACCTCTA GAGACAAAAT TTAATAACGC TTCATCCATT GCCATATTAT AATATGGGTC 2220
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AAATGTATAA TATTTGATTC GCTAATTAAT CAATTTAACT AAATGAATAA TAATTGCAAT 2340 

TCTTTAGTGA AATATTTTGA TAATTTGACC TAACAGTCTT ATAATTATAT TATCGTTTAA 2400

TTAGGGAGGA TGCAAGATGA GTGCTAGTTT GTACATCGCA ATAATTTTAG TTATAGCAAT 2460 

TATTGCTTAT TATGATTGTTC AACAAATTCT TAACAAGCGA GCTGTTAAAG AATTAGATCA 2520

AAATGAATTC CATAATGGGA TTAGAAAAGG TCAAGTCATC GATGTTAGAG AGAAAGTTGA 2580

CTATGACTAC GGTCACATTA ATGGGTCTCG CAATATTCCT ATGACAATGT TCAGGCAACG 2640

ATTCCAAGGA TTAAGAAAAG ATCAACCGGT ATACTTATGT GATGCCAATG GGATTGCTAG 2700

CTATAGAGCC GCTCGTATTT TGAAAAAGAA TGGATATACA GATATCTATA TGTTAAAAGG 2760

CGGCTATAAA AAATGGACTG GAAAAATAAA GTCTAAAAAA TAGTTTTTGT AAATTTAATA 2820 

TACGATTTAA TAAAATCTGA GTGTTAATTG ATCATCAATA ACAATACTCA GATTTTAATT 2880 

TTTTAACAAA GTCTGTTACT ATATTTCTCT AGCTTCACTG ATCATTAAAC TTAGTTTCAG 2940

CATAATAAAG AAAGTTCAGC TCATTTTCAA TACGATTCAA TTACCGCAAT CTAAAAAATG 3000 

AAAAGACAAT TTCTATGAAA GAATAATACC AAACCCTAAG AGTTATTACT TCGGTTTAGT 3060 

TTTCTTGTTT AAATAGAAAT TGTCTTTTTC AATTGATTTT GAAACCATTA TCCTTAAATC 3120

TTCATACAAA GTTAGAATAA TAATTCTCGG AATATGTGTT TAATACTTTA TTTTTCCTGT 3180 

TTAAGATTTT CAAACTTTAA TATTGGTTTA CGAGCAGCTG TAGCTTCGTC TAATCGATCA 3240

ATCACAGTTG TATGTGGTGC TTCTAGCacT TTATCAGGAT CATTTTTAGC TTCTTCAGCA 3300

ATACTAATTA ATGTATCGAT AAAATAATCA AGTGTTTCTT TAGACTCTGT CTCAGTCGGT 3360

TCAATCATCA TACCTTCTTC AACATTTAAT GGGAAGTATA TTGTTGGTGG ATGTACACCG 3420

AAATCTAATA ATCGCTTAGC CATGTCTAAA GTACGTACAC CAAATTCTTT TTGACGCACA 3480

CCACTTAACA CAAACTCGTG TTTACAATAT TGTTTATAAG GTATTTCAAA GTGTTTAGAT 3540

AAACGTGCTT TAATATAATT CGCATTAAGA ACCGCTGCTT CAGAAACCTC TTTAAGTCCA 3600

GTTGCTCCCA TAGTTCGAAT ATACGTATAA GCTCTTAAGT AAATACCAAA GTTACCATAA 3660 

AATGGTTTTA CACGTCCGAT AGAATTTTTA ATGTCATTAT CATATTTAAA TTTGTCGCCA 3720 

TCTTTAATAA CCATTGGCTT TGGTAAGTAA CTTGCTAGTT CTTTTACTAC ACCGACTGGA 3780

CCTGAACCAG GACCGCGACC ACCATGTGGA CCAGTAAATG TTTTATGCAA GTTTAAATGA 3840

ACAGCATCAA ATCCCATATC TCCTGGGCGA ACTTTGTCCA TAATAGCGTT TAAATTCGCA 3900

CCATCATAAT ATAATAGACC ACCAGCATTA TGGACGATTT CACGGATTTC CATAATATTT 3960 

TTTTCGAAAA TACCTAAAGT GTTTGGATTA GTTAACATAA TAGCTGCTGT ATTTTCATTT 4020
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GATTTAAATC CTGCAAATGa AGCTGAGGCT GGaTTCGTAC CATGCGCAGA ATCTGOcACA 414 0 

ATGACTTCAT CACGATGACC TTCACCATTA TTCTCATGGT AAGCTTTAAA TATCATCAAT 42 00 

5 GCAGTCCATT CACCATGTGC GCCAGCAGCT GGTTGTAATG TCACCTCATC CATACCAGTA 4 260 

ATTTCTTTTA ATTCTTCTTG CAAACTATAA ATAATTTCTA ATGAACCTTG AACTTGATCT 4 320 

TCATCTTGTA ATGGATGTGA TTCACTAAAT CCTGGTATTC TAGCAACCTT TTCATTAATT 43 80 

TTAGGGTTAT ACTTCATCGT ACATGAACCC AATGGATAAA ATCCGTTGTC TACACCGAAA 4440

TTTTTATTTG AAAGTTCAGT ATAATGACGT ACTAAGTCTA GTTCAGCAAC TTCAGGAAAC 4500

TCCGCTTTGT TTTTACGAAT AAATTTATCA TCTAACAATG ACTCAACAGA ATTTGTTTTA 4560

ATATCACTTT TTGGTAATGA ATATGCATAT CTGCCTTCAC GAGATCTTTC AAAAATTAAT 4620

GGACTTGATT TACTAGTCAT TTAACTCACC AGCCTTTTCT ACAAATGTAT CGATTTCATC 4680

TTTTGTTCTT AATTCAGTTA CAGCTATTAA CATGTGATTT TTAAAGTCGT CTGAAACAAC 4740

ACCTAAATCA AAACCACCGA TAATATTQTA CTTCACTAAT TCCTGGTTAA CTTGTTGAAT 4800

TGUTTTGTCA AATTTGACTA CAAACTCATT GmnAAGnTGT ACCATCTAAT ACTTCAAAAC 4860

CTTTTTTAAT AAATTGTTGT TTAGCATAGT TAGCATGTTC TATATTTTGA ACTGCAATAT 4920

CATAGATACC TTGTTTACCA AGTGCTGACA TTGCAATTGA TGaCGcTAAA GCATTTAATG 4980

CTTGGTTAGA ACAAATATTA GATGTCGCTT TATCGCGTCG AATATGTTGT TCACGTGCTT 5040

GTAATGTTAA TACAAAGCCA CGATTACCTT CATCATCTTG TGTTTGACCG ACTAATCTAC 5100 

CTGGCACTTT ACGCATTAAC TTTTTCGTCG TTGCAAAATA TCCACAATGT GGCCCACCGA 5160 

ATTGAGCAGG AATTCCGAAT GGCTGAGTAT CACCTACAAC AATATCTGCA CCAAATGAAC 5220 

CTGGAGGTGT AAGTAATCCC AATGCTAATG GATTTGCATA TACGATAAAT AATGCTTTTT 5280 

TATCFTCAAT AAAGCTATGA ATCTTTTCAA GATCTTCAAT TGAACCGTAA AAGTTTGGAT 5340

ATTGTACTGC AACAGCTGCT GTTTCATCAT CCACTGCTGC TTCTAATTTT TTCAAATCTG 5400 

TAACAGTGCC ATCTAAATCG ATTTCCACTA CTTCGAATTC CTTACGCGTC TTAGCATAAG 5460

TATGAAGTAC TTGTAATGCT TGATAATGTA AACCTTTTGA GACTACAATT TTATTTTTCT 5520 

TTGTTTGACT AAATGCTAAG ATACATGCTT CAGCAAAGCT AGTCATCCCA TCATACATAG 5580

AAGAATTTGC TACATCCATA TCTGTTAATT CACAAATTAA AGTTTGGAAC TCAAAAATGG 5640

CTTGTAATTC ACCTTGAGAA ATTTCCGGTT GATATGGCGT ATATGCTGTG TAAAATTCTG 5700

ATCTTGAAAT CATAGCATCC ACAACTGATG GCGCGTAATG ATCATAAACA CCAGCACCCA 5760

rAAATGATGT ATGCGTTTCT TTAGTGATAT tCTTGCTkGC AATGGGGATT TAAACnTCTA 5820 

(2) INFORMATION FOR SEQ ID NO: 67:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 18355 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 





ATnATAATTG GCTTTGCTAA TAATTACTTC CCTGAATTAC aAGTATTAGC AAACGAAATA 60

AAATCTGATA TGGCTAGTTC ATTAAAACAA TGATATTTTT ATTTAAATTT TTaAAGCTTT 120

GTACGAAATT GTACAAAGCT TTTTTGGTGC GTATTGTATG GGCAACAACT TGACGATGAA 180

AATCCGTTAC AGGATTGGTA ATAGGAAATG TTAGCGAAAG ACAAGGGTAT CCATTGTAGA 240

TTAACAAAAG GACGTTTCCA CAAGTGTGGG TTATTCTCAC TAAAGCAATA CGCAGAGACA 300

ACTTACGTAA AATTTTGAAC TGACTAGAAC GGAACTTCTA CTCAATTATT GATAAAAATT 360

TTCAAAAAGA CTTGAATGTG CTGAGAATAC GAAGTTTATG GAAGGATTAT CAAAATATAA 420

ATGTGCATTC ATTTACAACC TTTATTGACA ATGATTCTCA ACTAATATA3B TATATAATCA 480

AATCGTAATA GTTACGATTT GTTTTCTGCA ACTTTTTTGA AGTTTTAGTT GAGGTGAAAA 540

CAATAAAAGC ATCTAAGTGA ATGTAGTTAA CGGACAACTG CATTCGCTTG TAGAGCCACA 600

AGAAGCAACT TTAAATAAGG TTTACGGTTG CATTTTGATA CAACAACCGA TTACTAAGTC 660

ATGCTTTCCA CTTTGCGGGT TAGCATGACT TACCTAATAG ATAGAGCTAT TAGGTTCAGC 720

TTCTAAAAAA TTACAGTTTT AGAGGAATAC AGTTGcTTGc tTCGCAACAA CTGCATAAGA 780

GCCATGGTTT TCGCTTTTGC GAATTAGCAT GACTTACCTA CTAGATAGAG CTATTAGGTT 840

CATCTTCTAA AAAATTACAG GTTTAGAGGA ATACAGTTGT TTGcTTCGCA ACAACTGCAT 900

AAGAGCCTCT AGTAATTAAA ATTACAGAGG CTCTAAAAAT ACATCTAAAG GAGTGTCGTA 960

TGAATCGGCA GGTTATAGAA TTTTCTAAGT ATAATCCTTC GGGGAATATG ACGATACTTG 1020

TTCATTCAAA ACATGATGCT AGTGAATATG CATCTATCGC CAATCAGTTG ATGGCCGCAA 1080

CACATGTATG CTGTGAACAG GTAGGCTTTA TAGrATCAAC ACAAAATGAT GATGGTAATG 1140

ATTTTCACTT AGTTATGAGC GGTAATGAAT TTTGCGGTAA TGCGACGATG TCATATATAC 1200

ATCATTTGCA GGAAAGTCAT TTGCTTAAAG ACCAACAGTT TAAGGTGAAG GTGTCTGGCT 1260

GTTCGGATTT AGTGCAATGC GCAATTCATG ATTGCCAATA CTATGAAGTT CAAATGCCAC 1320

AAGCCCATCG TGTTGTGCCA ACAACAATTA ATATGGGTAA TCATTCATGG AAAGCAATAG 1380
TTCAACATTT GGTTGAAGCG TTTGTGCGTG AgcAACAATG GAGTCACAAA TATAAAACAG 1500

TAGGTATGAT GCTTTTTGAT GAACAACGTC AATTTTTACA GCCATTAATC TATATACCAG 1560 

AAATTCAAAG TTTAATTTGG GAAAATAGCT GTGGTTCTGG TACAgcATCA ATTGGGGTTT 1620

TTAATAATTA TCAACGTAAT GACGCATGCA AAGATTTTAC AGTACATCAG CCAGGGGGCA 1680 

GTATTTTAGT GACATCAAAG CGATGTCATC AATTGGGATA TCAAACTTCA ATTAAAGGAC 1740

AGGTTACAAC TGTAGCTACA GGaAAAGCAT ATATAGAATA AGGAGCCTAC AATGAATAAC 1800

TTTAATAATG AAATCAAATT GATATTACAA CAATATTTAG AAAAGTTTGA AGCGCATTAC 1860

GAGCGTGTAT TACAAGACGA TCAATATATC GAAGCATTAG AAACATTGAT GGATGACTAT 1920 

AGTGAATTTA TTTTAAATCC TATTTATGAA CAACAATTTA ATGCTTGGCG TGACGTTGAA 1980 

GAAAAAGCAC AATTaATAAA ATCACTGCAA TATATTACAG CGCAGTGTGT TAAACAAGTG 2040 

GAAGTCATTA GAGCGAGACG TCTATTAGAC GGACAGGCGT CTACCACAGG TTACTTTGAC 2100 

AATATAGAAC ATTGTATTGA TGAAGAGTTT GGACAATGTA GTATAGCTAG CAATGACAAA 2160 

TTATTGTTAG TTGGTTCAGG TGCATATCCA ATGACGTTAA TTCAAGTAGC AAAAGAAACA 2220 

GGTGCTTCAG TTATCGGTAT TGATATTGAT CCACAAGCCG TTGACCTAGG GCGCAGAATC 2280

GTTAACGTCT TAGCACCAAA TGAAGATATA ACAATTACGG ATCAAAAGGT ATCTGAACTT 2340

AAAGATATCA AAGATGTGAC GCATATCATA TTCAGCTCGA CAATTCCTTT AAAGTACAGC 2400

ATTTTAGAAG AATTATATGA TTTAACAAAT GAAAATGTCG TAGTTGCAAT GCGCTTTGGT 2460

GATGGCATCA AAGCAATATT TAATTATCCG TCACAAGAAA CAGCGGAAGA TAAGTGGCAA 2520

TGTGTGAATA AACATATGAG ACCACAGCAA ATTTTTGATA TAGCACTTTA TAAAAAAGCA 2580

GCTATAAAGG TAGGTATTAC GGATGTCTAA ATTATTAATG ATAGGCACTG GTCCgGTCGC 2640

AATGCAATTA GCGAATATTT GCTATTTAAA ATCAGATTAT GAGATTGATA TGGTTGGACG 2700

TGCCTCAACA TCAGAAAAAT CAAAACGCTT ATATCAAGCG TATAAAAAAG AGAAACAATT 2760 

TGAAGTCAAA ATACAAAACG AGGCGCATCA ACATCTGGAA GGTAAGTTTG AAATTAATCG 2820

TTTGTATAAA GATGTTAAAA ACGTTAAGGG TGAATACGAA ACGGTTGTCA TGGCATGCAC 2880

AGCAGATGCT TATTATGACA CACTACAGCA ATTGTCGTTA GAAACTTTGC AAAGTGTCAA 2940

ACATGTCATT TTAATATCAC CGACATTTGG TTCGCAAATG ATTGTCGAAC AATTTATGTC 3000

TAAATTTAAT AAAGATATCG AAGTGATTTC ATTCTCAACT TATCTTGGCG ATACACGTAT 3060

TGTTGATAAA GAAGCGCCTA ATCATGTGTT GACAACAGGT GTAAAAAAGA AATTGTACAT 3120

GGGATCGACA CATTCAAACT CAACAATGTG TCAACGAATC TCTGCTTTAG CTGAGCAATT 3180 
TTATGTGCAC CCACCACTAT TTATGAATGA CTTTTCATTG AAAGCCATTT TCGAAGGAAC 3300

AGATGTACCG GTTTATGTGT ATAAGTTATT TCCTGAAGGA CCGATAACGA TGACACTAAT 3360

CCGTGAAATG CGTTTAATGT GGAAGGAAAT GATGGTTATT TTACAAGCAT TTAGAGTGCC 3420

GTCAGTCAAC CTGCTTCAAT TTATGGTGAA GGAAAATTAT CCAGTACGTC CTGAAACTTT 3480

GGATGAAGGT GATATTGAGC ATTTCGAAAT CTTGCCAGAT ATCTTACAAG AATATCTGCT 3540

TTATGTAAGA TATACCGCAA TCCTCATTGA TCCATTTTCA CAGCCAGACG AAAACGGACA 3600

TTACTTTGAT TTTTCAGCTG TACCATTTAA GCAAGTCTAT AAAAATGAAC AGGATGTTGT 3660

TCAAATTCCA AGAATGCCAA GTGAAGATTA TTACAGAACG GCGATGATTC AGCATATTGG 3720

GAAAATGCTA GGTATCAAAA CGCCAATGAT TGATCAGTTC CTAACTCGCT ATGAAGCAAG 3780

TTGCCAGGCG TACAAGGATA TGCATCAAGA TCAACACTTA TCTTCTCAAT TTAATACAAA 3840

TCTATTTGAA GGAGATAAAG CACTCGTCAC aaaatttttg GAAATCAATA GAACGCTTTC 3900

ATAATAAGGG TTTGAAGTTT TATAATAGAA AAAAATTATT GAATTATGTT TGACATTTAC 3960

ATAAAAATAA GCAAATAATT GAGAAAAATA ATCATTACGA TTTGATTAAG TAATGCAACT 4020

TATCAATTTA GAAAGAGGAA AAGCAAATGA GAAAACTAAC TAAAATGAGT GCAATGTTAC 4080

TTGCATCAGG GCTAATTTTA ACTGGTTGTG GCGGTAATAA AGGTTTAGAG GAGAAAAAAG 4140

AAAACAAGCA ATTAACGTAT ACGACGQTTA AAGATATCGG TGATATGAAT CCGCATGTTT 4200

ACGGTGGATC AATGTCTGCT GAAAGTATGA TATACGAGCC GCTTGTACGT AACACGAAAG 4260

ATGGTATTAA GCCTTTACTA GCTAAAAAGT GGGATGTGTC TGAAGATGGG AAGACATACA 4320

CGTTCCATTT GAGAGATGAC GTTAAATTCC ATGATGGTAC GCCATTTGca TGctGACGCA 4380

GTTAAGAAAA ATATTGACGC AgTTCAAGAA AACAAAAAAT TGCATTCTTG GTTAAAGATT 4440

tcgAcattaa TTGACAATGT TAAAGTTAAA GATAAGTACA CGGTTGAATT GAATTTGAAA 4500

GAAGCATATC AACCTGCATT GGCTGAATTA GCGATGCCTC GTCCATATGT ATTTGTGTCT 4560

CCAAAAGACT TTaAAAACGG TACAAcAAAA GATGGCGTTA AAAAGTTCGA TGGTACTGGT 4620

CCATTTAAAT TAGGTGAACA CAAAAAAGAT GAGTCTGCAG ACTTTAACAA AAATGATCAA 4680

TACTGGGGCG AAAAGTCTAA ACTTAACAAA GTACAAGCAA AAGTAATGCC TGCTGGTGAA 4740

ACAGCATTCC TATCAATGAA AAAAGGTGAA ACGAACTTTG CCTTCACAGA TGATAGAGGT 4800

ACAGATAGCT TAGACAAAGA CTCTTTAAAA CAATTGAAAG ATACAGGTGA CTATCAAGTT 4860

AAGCGTAGTC AACCTATGAA TACGAAAATG TTAGTTGTCA ATTCTGGTAA AAAAGATAAC 4920

GCTGTGAGTG ACAAAACAGT CAGACAAGCG ATTGGTCATA TGGTAAACAG AGATAAAATT 4980
ACAGACATTA ATTTCGATAT GCCAACACGT AAGTATGACC TTAAAAAAGC AGAATCATTA 5100

TTAGATGAAG CTGGTTGGAA GAAAGGTAAA GACAGCGATG TTCGTCAAAA AGATGGTAAA 5160

AACCTTGAAA TGGCAATGTA CTATGACAAA GGTTCTTCAA GTCAAAAAGA ACAAGCAGAA 5220

TACTTACAAG CAGAATTTAA GAAAATGGGT ATTAAGTTAA ACATCAATGG CGAAACATCA 5280

GATAAAATTG CTGAACGTCG TACTTCTGGT GATTATGACT TAATGTTCAA CCAAACTTGG 5340

GGATTATTGT ACGATCCACA AAGTACTATT GCAGCATTTA AAGAGAAAAA TGGTTATGAA 5400

AGTGCAACAT CAGGCATTGA GAACAAAGAT AAAATATACA ACAGCATTGA TGACGCATTT 5460

AAAATCCAAA ACGGTAAAGA GCGTTCAGAC GCTTATAAAA ACATTTTGAA ACAAATTGAT 5520

GATGAAGGTA TCTTTATCCC TATTTCACAC GGTAGTATGA CAGTTGTTGC ACCaAAAGAT 5580

TTAGAAAAAG TATCATTCAC ACAATCACAG TATGAATTAC CATTCAATGA AATGCAGTAT 5640

AAATAAAGGA GCAATTAGAT GTTCAAATTT ATCTTAAAAC GTATTGCGCT CATGTTTCCA 5700

TTGATGATTG TAGTAAGTTT TATGACATTT CTATTGACGT ATATTACAAA TGAAAATCCA 5760

GCTGTGACAA TTTTACATGC ACAAGGGACG CCAAATGTAA CACCAGAGTT GATTGCAGAA 5820

ACGAATGAGA AGTACGGTTT CAATGATCCA TTATTAATTC AATATAAAAA TTGGTTACTT 5880

GAAGCGATGC AATTTAATTT TGGTACAAGC TACATTACAG GTGACCCAGT TGCTGAACGT 5940

ATTGGTCCAG CATTTATGAA TACATTGAAA TTAACAATAA TTTCAAGTGT TATGGTGATG 6000

ATTACATCAA TTATTTTAGG TGTAGTTAGT GCATTAAAAA GAGGAAAGTT CACTGATCGT 6060

GCGATACGTT CAGTGGCTTT CTTTCTAACT GCATTACCAT CATATTGGAT AGCTTCAATA 6120

CTTATTATTT ACUTTTCAGT GAAGTTAAAC ATATTGCCGA CTTCTGGATT AACAGGTCCA 6180

GAAAGTTACA TATTGCCAGT GATCGTTATT ACGATTGCCT ATGCTGGTAT TTACTTTAGA 6240

AATOTTAGAC GCTCGATGGT GGAACAATTA AATGAAGATT ATGTACTTTA TTTAAGAGCA 6300

AGCGGTGTGA AATCTATCAC ATTAATGTTG CATGTGTTGC GTAATGCTTT ACAAGTTGCG 6360

GTATCAATCT TTTGTATGTC TATACCAATG ATAATGGGTG GACTAGTTGT TATCGAGTAT 6420

ATCTTTGCAT GGCCTGGACT AGGTCAATTA AGTTTAAAAG CAATACTTGA ACACGATTTT 6480

CCAGTCATTC AAGCATATGT ATTAATTGTA GCGGTATTAT TTATTGTATT TAATACATTA 6540

GCAGATATCA                       AGATTAAGGG aGGGCGCACG ATGATAATTT 6600

TAAAmCGATT ATTmCArGwT AAAGGTGCAG TAATTGCTTT AGGCATTATT GTATTATATG 6660

TCTTTTTAGG ATTAGCAGCA CCACTTGTGA CATTTTATGA TCCTAACCAT ATCGATACAG 6720

CAAACAAATT TGCTGGCATG AGTTTTCAAC ATCTACTAGG TACTGACCAT TTAGGTAGAG 6780
TATTTGTTTC TGTACTTATT GGATCTATTT TAGGATTCTT ATCAGGATAT TTCCAAGGGT 6900

TTGTTGACGC CTTAATCATG CGTGCGTGTG ATGTTATGTT GGCATTCCCA AGTTATGTTG 6960 

TAACGTTAGC ATTAATTGCA TTGTTTGGAA TGGGTGCCGA AAATATTATC ATGGCATTTA 7020

TTTTGACGCG TTGGGCATGG TTCTGTCGTG TTATACGTAC AAGTGTTATG CAGTACACTG 7080

CTTCTGACCA TGTAAGATTT GCTAAAACAA TCGGTATGAA TGATATGAAA ATTATTCACA 7140

AACATATTAT GCCATTAACA TTAGCAGATA TTGCTATCAT CTCTAGTAGC TCGATGTGTT 7200

CAATGATCTT GCAAATATCT GGCTTTTCAT TTTTAGGATT AGGTGTCAAA GCGCCTACTG 7260

CAGAGTGGGG CATGATGCTT AACGAaGCTA GAAAAGTGAT GTTTACACAT CCTGAAATGA 7320

TGTTTGCGCC AGGTATTGCC ATAGTGATTA TAGTGATGGC ATTTAACTTC TTATCCGATG 7380 

CTTTACAAAT TGCTATTGAT CCCCGCATCT CTTCTAAAGA TAAACTTCGT TCTGTGAAAA 7440

AAGGAGTGGT GCAATCATGA CATTGTTAAC AGTTAAACAT TTGACGATTA CAGATACCTG 7500 

GACAGATCAA CCACTCGTGA GTGATGTGAA TTTTACATTA ACTAAGGGTG AAaCTTTAGG 7560 

CGTTATTGGA GAAAGTGGTA GTGGTAAATC AATCACTTGT AAATCGATTA TTGGTTTGAA 7620

TCCCGAACGA CTCGGGGTGA CAGGTGAAAT TATCTTTGAT GGTACAtCAA TGTTGTCATT 7680

ATCTGAATCG CAATTGAAAA AGTACCGTGG TAAAGACATT GCGATGGTCA TGCAACAAGG 7740

TAGTCGTGCC TTTGACCCAT CAACTACTGT CGGTAAACAA ATGTTTGAGA CTATGAAAGT 7800

ACATACGTCA ATGTCTACAC AAGAAATTGA AAAGACATTG ATTGAATATA TGGATTATTT 7860

AAGTTTGAAA GATCCTAAAC GTATATTAAA ATCATACCCT TACATGTTAT CAGGAGGAAT 7920

GTTACAGCGA TTGATGATTG CTTTAGCGTT AgcTTTgAAA CGAAAGTTAA TCATTGCTGA 7980

TGAGCCGACA ACGGCTTTAG ATACAATTAC ACAATATGAT GTACTGGAAG CATTTATAGA 8040

TATTAAAAAA CACTTTGACT GTGCGATGAT TTTCATTTCA CATGATTTAA CGGTTATTAA 8100

CAAGATTGCA GACCGTGTTG TTGTGATGAA AAATGGTCAG CTTATTGAAC AAGGGACACG 8160

TGAATCAGTC TTGCATCATC CAGAACATGT TTATACGArt ATTkTATTAT CAACGAAGAA 8220 

GAAGATTAAT GATCATTTTA AACATGTGAT GAGGGGTGAT GTACATGATT AAAATTAAAG 8280 

ATGTTGAAAA GTCATATCAA AGCGCACATG TTTTTAAGCG TCGTCGAACA CCTATCGTGA 8340

AAGGTGTGTC ATTTGAGTGT CCAATCGGTG CGACGATTGC GATTATCGGA GAAAGTGGTA 8400

GCGGTAAATC GACGTTGAGT CktATGATAT TAGGTATTGA GAAACCGGAT AAAGGTTGTG 8460

TAACCTTAAA TGATCAACCG ATGCATAAGA AGAAAGTGAG ACGTCATCAA ATTGGTGCTG 8520

TATTTCAAGA TTATACGTCA TCATTACATC CATTTCAGAC TGTTAGAGAA ATCTTATTTG 8580
TGTTGGAAGA AGTCGGTCTA TCTAAGGCAT ACATGGATAA ATATCCTAAT ATGTTATCAG 8700

GTGGAGAGGC GCAACGTGTT GCGATTGCGC GTGCAATATG TATTAACCCT AAATATATTT 8760

TGTTTGATGA AGCCATTAGT TCACTCGACA TGTCAATTCA AACACAAATA TTAGATTTAT 8820

TGATTCATTT ACGTGAAACG CGTCAGTTGA GTTATATTTT TATCACACAT GATATTCAAG 8880

CTGCCACGTA TTTATGTGAT CAATTAATTA TTTTTAAAAA CGGAAAAATA GAAGAACAAA 8940

TTCCGACAAG CGCATTGCAT AAAAGTGACA ATGCTTATAC AAGAGAATTA ATAGAAAAAC 9000

AACTATCATT CTAAGGAGTG AGATAATGAA AGGTGCAATG GCTTGGCCCT TTTTGAGATT 9060

ATATATATTA ACATTGATGT TCTTTAGTGC CAATGCAATC TTAAACGTGT TTATACCTTT 9120

ACGAGGGCAT GATTTAGGCG CAACGAATAC GGTTATCGGT ATCGTTATGG GGGCATACAT 9180

GTTAACAGCA ATGGTATTTC GACCATGGGC AGGACAAATT ATTGCTCGTG TCGGTCCCAT 9240

TAAAGTATTA AGAATTATTT TGATTATCAA TGCCATAGCT TTAATTATTT ATGGTTTTAC 9300

TGGCTTAGAA GGTTATTTCG TAGCACGTGT TATGCAAGGT GTGTGTAGGG CATTCTTTTC 9360

TATGTCTTTA CAGCTAGGTA TTATTGATGC ATTACCAGAG GAACATCGTT CTGAAGGTGT 9420

ATCATTGTAC TCGCTATTTT CAACGATTCC AAACTTAATC GGACCATTAG TTGCCGTAGG 9480

TATTTGGAAT GCAAATAATA TTTCACTATT TGCAATTGTC ATTATCTTTA TCGCATTAAC 9540

AACAACATTC TTTGsTATCG CGTGACCTTT GCTGAACAGG AACCCGATAC GTCAGATAAG 9600

ATTGAAAAAA TGCCGTTTAA CGCTGTAACT GTTTTTGCGC AATTTTTCAA AAATAAAGAG 9660

TTGTTGAACA GTGGTATTAT CATGATTGTT GCATCGATTG TATTTGGTGC AGTTAGTACA 9720

TTTGTACCGT TATACACAGT GAGTTTAGGA TTCGCGAATG CGGGAATCTT TTTGACAATA 9780

CAGGCCATCG CAGTTGTTGC GGCAAGATTT TACTTAAGGA AATACATTCC GTCAGATGGT 9840

ATCffiGCATC CTAAATATAT GGTATCTGTA CTATCATTAT TAGTAATCGC GTCATTTGTA 9900

GTGCSCATTTG GTCCGCAAGT AGGTGCAATT ATTTTCTATG GTAGTGCGAT ATTAATAGGA 9960

ATGACGCAAG CAATGGTGTA CCCAACATTA ACATCATACT TAAGCTTCGT CTTACCAAAA 10020

GTAGGTCGTA ATATGTTGTT AGGTTTATTT ATTGCCTGTG CAGACTTAGG TATATCGTTA 10080

GGTGGCGCAT TGATGGGACC TATTTCCGAT TTAGTAGGAT TTAAATGGAT GTATCTAATT 10140

TGTGGTATGT TAGTCATTGT AATAATGATT ATGAGTTTCT TGAAAAAGCC AACACCACGT 10200

CCAGCGAGTA GTCTTTAATG AAGTGAATTA AAGCATATTA AGTTAATGAA TATTTAAATT 10260

TTAAAAGGTA TATTGaGCAT GGCGATTCAT GTGCTTCATG CTAGGACATG AAACATTCTA 10320

TATGGCTCGT TTTTAGAACG ACAtATATCT AAATAAAGCA CGCTTArAAG TGAGTTTTGA 10380
TTACATGAAA ATATGCAAAA CGAGTATAAC TGCTAATTGA TAGAAATAGC TCACCATAAA 10500

ATTACGGTAT GATTTTAAAT ATAAGTAAGT CGCACTACCT GCTAGTATCA ATGCTGGAAT 10560

GAATTCCCAC CATGTATTAA TGTATGGATA GTAGAACAGA GTTTCAAGGA TAATGGACAA 10620

TACTATTGTA ATCTTTAAAG GTATTAATCT GCTTAATTCT TGAATTAAAA TATGACGGAA 10680

AATAAGTTGA CAAATCAAAG TATTTAATAT AATGGTTAAC GAAAATATAG CTATTAAACT 10740

GATGGAaCCA TACCCTTTAA TGAGCGGGTA AATGTCAAAG ACAGTAAAGG AATCTACATT 10800

TAGTGCGAAA ATATTGAAAT GATTTAAAAG TAAAAAGAGT ACGACACTTA GTGTAAATGA 10860

TATAAGAATA TGCCATTTAT ATTTAGCACT AGCAACGATT TGCGAACGTA TCATTGGAAT 10920

AAACGCATCT TCATGCATCA GACGAAAAAT AGCTAGTGAA ATAATAACTG CGAGTAAATA 10980

GCTAATGTTC ATTGAAATAG GAAAAGAGAA ACCCCACGGA GCTTGTTGAG TGAATACAGC 11040

TACTAACCCA AAAGTTAAAA AGACGATAAT GATCGGCAAG ATGTTAACCA AAAATATGTA 11100

AAGGAAAATA AATCCAATAT CACGTTTGAA AAAACGCGAT TGTTCGGTAG CGTATTCTTC 11160

TTCTATGTAA TGTTTATTTG TATTTGACAT AGTATACCTC TTAAATAGTT GTATTATATA 11220

GATACTTTAG CACATATTAC TTTGTATTGT ATGTTTTATA CATTAAAATT TAAAATGAAA 11280

AACATATCAT AAAATTGTTT TATAAAATGA AGCGCTTCCA TTGTGTTTTG TTTTGTAAGG 11340

TGTATCATAA ATATTGAATT GAAATTTTGG GGGGAGGTAT TGTAATGACG TTTCTTACAG 11400

TCATGCAATT TATAGTTAAC ATTATCGTTG TAGGATTCAT GCTTACGGTT ATTGTTATCG 11460

GGCTTATTTG GTTAATTAAA GATAAAAGAC AATCACAACA TAGTGTATTA AGGAATTATC 11520

CTTTACTAGC ACGTATTAGA TATATTTCAG AAAAAATGGG ACCGGAATTA CGTCAGTATT 11580

TATTTTCTGG GGATAATGAA GGGAAACCTT TTTCACGTAA TGATTATAAA AATATCGTTT 11640

TGGCfiGGAAA ATATAACTCT CGTATGACCA GCTTCGGTAC TACTAAAGAT TATCAAGACG 11700

GCTITTACAT ACAGAACACA ATGTTTCCGA TGCAACGTAA TGAGATTTCA GTAGATAATA 11760

CAACATTGTT ATCAACATTC ATTTATAAAA TCGCGAATGA GCGTTTATTT AGTCGTGAAG 11820

AATATCGTGT GCCGACAAAG ATTGATCCGT ATTACTTAAG TGATGACCAT GCAATAAAAT 11880

TAGGTGAACA TTTAAAACAT CCATTTATTT TAAAACGTAT CGTAGGACAA TCTGGTATGA 11940

GTTATGGCGC TTTAGGAAAA AATGCCATTA CAGCTTTATC TAAAGGTCTA GCTAAAGCGG 12000

GCACTTGGAT GAATACAGGT GAAGGTGGCT TATCAGAATA TCATTTAAAA GGTAATGGGG 12060

ATATCATTTT CCAAATTGGT CCCGGTTTAT TTGGTGTTCG TGATAAAGAA GGTAATTTTA 12120

GTGAAGGTTT ATTTAAAGAG GTTGCACAGT TATCTAACGT ACGCGCATTT GAGCTGAAGT 12180
TTGCTAAAAT


CCGAAATGTT 


GAACCTTATA 


<r\svw*4"v\x 


X i nnv. 


W\> 1 1 ALUAA 1 


12300 






TTATTCATAA 


TGCTGAAGAT 


TTGATTCGTT 


TCGTCGATCA 


GTTGCAGCAA 


X X AVJU 1 V.JWtA 


123 60 




5 


AACCAGTAGG 


ATTCAAAATT 


GTAGTAAGCA 


AAGTTTCAGA 


AATTGAAACA 


rTTnTAPnTA 
v. i ivj 1 x M 


124 20 






CGATGGTGGA 


ACTAGATAAG 


TATCCAAGCT 


TTATTACGAT 

x x r% X X^\WW^^X 


TGATGGTGGT 

X VJX> X VJ>J X W X 




124 80 




10 


CTGGTGCAAC 


ATTCCAAGAA 


TTACAAGATG 


GTGTTGGCTT 

X VJ X X WV X X 


AC CG CTATTT 

»1W W S* X «» xxx 


APAnrrrTAr 


12540 




CTATTGTGTC 


TGGCATGTTA 


GAAAAATATG 


GTATTCGAGA 


TAAAGTRAAA 


1 ibuLuuLAl 


12600 






CTGGTAAGTT 


AGTGACACCA 


GATAAAATTG 


CG ATTG CA CT 


xxx nvj^j x 


uw\unl lllu 


12660 




15 


TAAATATCGC 


ACGTGGGATG 


ATGATTAGTG 


X CUVI X X w X S\ x 




f*A A1*r^*T , ^lV^lV 
V-AA 1 1 LALA 


12720 






TGAATACGTG 


TCCTGTAGGT 


GTTGCAACGA 


CAGATGCGAA 




tjV— rVl luAi IX* 


12780 






TTGGAGAAAA 


GCAATATCGT 


GTCACAAACT 


ATGTAACAAR 
x w x nnwvuj 


Tnr r mr , ATY^ a a 

x x loUnlvAn 


1 1 Ax x wA 


12840 




20 


AT ATTG CAGC 


AG CTGTTGGC 


GTATCCAGTC 


CTACAGAAAT 


T ACTY2 C^TCZ AT 

X X UV. X V2/\ X 


r* A T A *P*TY*T 21 T 
v-nlAl 1\j1A1 


12 900 






ATCGAAAAGT 


CGATGGTGAG 


TTACAAACGA 


x nv__rt. x X x « 


TAAATTAAAB 


w 1 LAITAGTT 


12960 






AACTTAATTA 


TTTCGGGAAA 

x x x vwvMvm 


TTGAAAfZCAfi 


wvVjtA.1 X X liiVj 




AATAATTTTA 


13020 




25 


TATTAGTAGT 


GGATGCTGGT 




TTTrAAATAT 
\_ X X %»n/v\X aX 


TAAAftorrrr 
x nmwvvw ^ v» 


AUMiATuAA 


13080 






TTAAGGTTTG 


TAACCTTAGT 


CTTATCTGAG 


GGCATTTTTA 

w\J\v^AX XXX X 


AOTTATAAAr 


Inil lV»lV_.Vjrl 


13140 






CCATTTTATC 


TTTTTCTTTT 




wn— X X X#W\X XVJ 


Willi Liinu X 


111 1 LAAAAL 


13200 




30 


TAATATCTTT 


A" T TTT CTTTA 


GTCGAAACAC 


C A IV PJ 21 CPiTTT 


ni 1 innl 111 


1 1 vJATGTCAA 


13260 






CTTCTGTGTA 


ATCTATGTCT 


AAGTGvTCAA 


X X w\« X X X X X X 


n x w x x x n x nu 


lv.1 ML X 1 1 V? 1 


1 3 J2U 




35 


ATTTTACGCC 


T7TAAGGTCT 


TTGAAAATAC 


TTTCAGATTT 
xxx wiun xxx 


GGCGAATAAC 


1111 IvuWl 1 


i nan 
13 JoU 




CGTCTTTATC 


CATACCTAGA 


TCGTCATATT 


X m*T\ X X Uiw X X 


GATTGTAGAC 
vx#% x x x nun w 


TfTrTTT^A AAA 
Xwl 1 1 XnMMn 


1344U 






CTTfATCATC TTTATATGTG 


ATAGAAGTTA 


vj x nwi x xxx 


A C CACT AACA 


1 WACWwl Wnl 


1 3 3UU 




40 


ATGTTTTGGT 


TTGTTCTTTA 


CCACAAGCTG 


ATAATGCAAT 


GATACAAACT 


AATGCTACTA 
nn x u w x x n 


1 7 ft 
1 3 SO U 






CAATTAATGA 


ACATAATTTT 


TTCAAAGTCA 


GTCGC CTTCT 


TTCGATATTT 


GTATTATAAA 
vj xn x x /a x £Wv% 


1JOZU 






GAAATTATAA 


CATTTACTAA 


AAAATGATGT 


TATTCAAAAA 


TTTAAATfTT 

X X xxx 


V9 1 W/\l 11111 


1JOOU 




45 


TTGAAGATAT 


GAGTTTTTTT 


AAG CGGATTC 


CTCACAAAAT 


TTTAAAAATA 
x x x n/wv\ X f\. 


1 1 1 AAVaWW 1 />. 


1 3 / 4 w 






AAAATGATAA AGCGkTAGGG AACGTTTTTC TGAAAGTTAG TGATACAATA GTTTTAAGTT 13800

GAAATACAGG AGGATGAATA ACATGAATCA GTCAGTCAAA TTACTTAAAC ATTTAACAGA 13860

TGTAAACGGC ATTGCTGGTT ATGAAATGCA AGTTAAAGAA GCAATGCGTa ACTATATAGA 13920

GCCTGTCAGT GATCAAATTA TTGAAGATAA CTTGGGTGGC ATTTTTGGAA AGAAAAATGC 13980
AACAAAGATT GATAAACATG GTTTTATTTC ATTTACGCCA kTgGTGGATG GTGGAATCAA 14100

GTCATGCTAT CTCAAAAAGT AACGATTACA ACAGATTCGG GCAAAGAAAT TAGAGGTATC 14160

ATCGGTTCTA AACCGCCACA TGTCTTAACG CCTGAAGAAC GTAAAAAGCC AATGGAAATC 14220

AAAAATATGT TTATAGATAT TGGTGTTAGT AGCAAGGAAG AAGCTGAAGA AGCTGGCGTT 14280

GAAGTAGGCA ATATGGTTAC GCCATATAGT GAATTTGAAG TGCTTGCAAA TGATAAATAT 14340

TTAACTGCGA ArCATTTGAT AATCGCTATG GCTGTGCATT AGCTATTGAG GTATTAAAAC 14400

GTTTAAAAGA TGAAAATATT GGCATTAACT TATACAGTGG TGCCACAGTG CAAGAAGAAG 14460

TTGGTTTGCG TGGTGCGAAA GTGGCAGCGA ATACGATTAA ACCAGACTTG GCGATAgcTG 14520

TCGATGTAGG TATTGCTTAT GATACCCCAG GTATGTCAGG TCAAACGAGC GATAGTAAAC 14580

TAGGCGGTGG TCCAGTTGTC ATTATGATGG ATGCTACAAG TATTGCTCAC CAAGGTTTGC 14640

GAAAgcATaT TAAAGATGTA GCTAAGGAAC ATAACATCGA AGTACAATGG GATACGACAC 14700

CAGGTGGAGG TACAGATGCG GGAAGTATTC ATGTCGCAAA TGAAGGTATT CCAACGATGA 14760

CAATCGGTGT TACGCTGCGA TACATGCATT CTAATGTTTC AGTGCTCAAT GTAGATGATT 14820

ATGAAAATTC TATCCGTCTT GTTACTGAAA TTGTCCGTTC ATTGAATGAT GAAAGTTATA 14880

AAAATATCAT GTGGTAATCA AATCCATAAA TAATAAAGAA TCCTTTTAAT ATGGTAGGTT 14940

GTTAAACAAT TGTCTAATTT TAATTCTTAG TCATTAGACA GTATCCATGT TAATAGGATT 15000

rriWlTATl AATTTAAATG CTGAAAATCA ATTATGCCTA AATTTTGATA TTACAAGAAA 15060

atgatttttt CTTAAATGTA ATTGCACTAA AAACCAAAAA AACGGGAATA ATATACCTGA 15120

TATATTACAT GAGGAGCGGT GCAAATGTTG TTAGAAATTA AAGATTTAGT GTATAAAGCG 15180

AGCGATAGAA TCATACTAGA TCATATCAGT CTAAAAGTAG ATAAAGGCGA GAGTATTGCC 15240

ATTATAGGTC CATCAGGTAG TGGTAAAAGT ACATTTCAAA AGCAAATATG TAATTTGTTT 15300

AGTCCAACTA GTGGAGAACT TTATTTTAAA GGTAAACCCT ATAATGATTA TGACCCGGAA 15360

GAATTGCGTC AACGAATCAG TTATTTGATG CAGCAAAGTG ACTTUTTTGG TGAAACGATT 15420

GAAGATAACA TGATATTCCC ATCACTTGCA CGTAATGATA AATTTGATAG AAAACGTGCA 15480

AAGCAATTAA TTAAAGATGT CGGTTTGGGA CATTATCAAT TAAGTTCGGA AGTGGAAAAT 15540

ATGTCGGGTG GTGAGCGGCA AAGAATTGCT ATAGCGCGCC AACTGATGTA TACACCGGAT 15600

ATTCTTTTAT TAGATGAATC GACCAGTGCA TTAGACGTTA ATAATAAAGA AAAGATAGAA 15660

AATATCATTT TTAAATTAGC AGATCAAGGC GTGGCAATTA TGTGGATTAC CCACAGCGAT 15720

GACCAAAGTA TGCGACACTT TCAAAAGCGT ATAACAATTG TTGATGGTCA AATTTCTAAT 15780